Go Faster!

C. J. Date

Go Faster! The TransRelational™ Approach to DBMS Implementation


Go Faster!
© 2002, 2011 C. J. Date & bookboon.com
ISBN 978-87-7681-905-7



It’s the best possible time to be alive, when almost everything you thought you knew is wrong — Tom Stoppard
Will you go a little faster? — Lewis Carroll (more or less)
... trying not to let the joins show — Rudyard Kipling
Good order is the foundation of all good things — Edmund Burke
Only connect! — E. M. Forster
… clever condensation, and accuracy, accuracy, accuracy — Joseph Pulitzer

—◆◆◆◆◆—

I’d like to dedicate this book to all of the people at Required Technologies, Inc., who have been working hard to turn the ideas described herein into a commercial reality. I hope this book does justice to their efforts.


Contents

About the Author 11
Foreword 12
Preface 15

Part I: Preliminaries 18

1 “Go Faster!” 19
1.1 Introduction 19
1.2 TR Technology and the Relational Model 20
1.3 Model vs. Implementation 24
1.4 So How is it Done? 26
1.5 Structure of the Book 28

2 The Historical Context 31
2.1 Introduction 31
2.2 Ordering 36
2.3 Indexing 38
2.4 Pointer Chains 44
2.5 Hashing 46
2.6 Data Compression 48
2.7 Concluding Remarks 50

3 Three Levels of Abstraction 53
3.1 Introduction 53
3.2 The Relational Level 55
3.3 The File Level 57
3.4 The TR Level 58

Part II: The Transrelational Model 61

4 Core Concepts 62
4.1 Introduction 62
4.2 The Crucial Idea 63
4.3 The Field Values Table 63
4.4 The Record Reconstruction Table 66
4.5 Building the Record Reconstruction Table 72
4.6 The Record Reconstruction Table is not Unique 77

5 Core Concepts (Continued) 80
5.1 Introduction 80
5.2 Some Remarks on Performance 80
5.3 TR Operators 83
5.4 Building the Record Reconstruction Table: An Alternative Approach 86
5.5 Record Reconstruction Revisited 88
5.6 Pointers are Field Value Surrogates 89
5.7 The Field Values Table is a Directory 90
5.8 Miscellaneous Implementation Alternatives 91

6 Implementing the Update Operators 93
6.1 Introduction 93
6.2 Overview 94
6.3 A Detailed Example 97
6.4 The Swap Algorithm 101
6.5 Using an Overflow Structure 105
6.6 Some Remarks on Performance 106

7 Major-to-Minor Orderings 109
7.1 Introduction 109
7.2 The Suppliers-Parts-Projects Example 109
7.3 A Preferred Record Reconstruction Table 111
7.4 Building a Preferred Record Reconstruction Table 116
7.5 Another Example 118
7.6 Analysis 120

8 Condensed Columns 122
8.1 Introduction 122
8.2 Condensing the Field Values Table 123
8.3 Implications for Record Reconstruction 129
8.4 Expanding the Record Reconstruction Table 129
8.5 Further Space-Saving Techniques 132

9 Merged Columns 134
9.1 Introduction 134
9.2 The Bill-of-Materials Example 134
9.3 A Foreign Key Example 141
9.4 Another Kind of Merging 144
9.5 Concluding Remarks 144

10 Implementing the Relational Operators 146
10.1 Introduction 146
10.2 Restrict 148
10.3 Project 152
10.4 Extend 155
10.5 Summarize 156
10.6 Join 159
10.7 Union, Intersect, and Difference 165
10.8 Materializing Derived Relations 167
10.9 A Note Regarding Optimization 168
10.10 A Note Regarding Constraints 169
10.11 What's Missing? 171

Part III: Disk-Based Implementation 174

11 General Disk Considerations 175
11.1 Introduction 175
11.2 What's the Problem? 176
11.3 Addressing the Problem 177
11.4 Compressing the Field Values Table 179
11.5 Compressing the Record Reconstruction Table 183
11.6 Minimizing Seeks 187

12 File Factoring 189
12.1 Introduction 189
12.2 A Simple Example 190
12.3 Elaborating on the Example 193
12.4 Further Possibilities 199
12.5 Record Reconstruction 203
12.6 Additional Benefits 206

13 File Banding 211
13.1 Introduction 211
13.2 A Simple Example 213
13.3 Elaborating on the Example 219
13.4 How it's Really Done 221
13.5 Controlled Redundancy 225

14 Stars and Zigzags 227
14.1 Introduction 227
14.2 A Simple Example 230
14.3 Elaborating on the Example 233
14.4 What Happens on Disk 236
14.5 Controlled Redundancy 237

Part IV: Conclusion 241

15 The Future Looks Bright Ahead 242
15.1 Introduction 242
15.2 The TR Model Summarized 242
15.3 Analysis 246
15.4 A Review of the Benefits 254
15.5 Possible Future Developments 261

Appendixes 266
Appendix A: Exercises 267
Appendix B: References and Bibliography 284


About the Author

C. J. Date is an independent author, lecturer, researcher, and consultant, specializing in relational database technology. He is best known for his book An Introduction to Database Systems (8th edition, Addison-Wesley, 2004), which has sold some 850,000 copies at the time of writing and is used by several hundred colleges and universities worldwide. He is also the author of many other books on database management, including most recently:

• From Addison-Wesley: Databases, Types, and the Relational Model: The Third Manifesto (3rd edition, coauthored with Hugh Darwen, 2006)
• From Trafford: Logic and Databases: The Roots of Relational Theory (2007)
• From Apress: The Relational Database Dictionary, Extended Edition (2008)
• From O’Reilly: SQL and Relational Theory: How to Write Accurate SQL Code (2009)
• From Trafford: Database Explorations: Essays on The Third Manifesto and Related Topics (coauthored with Hugh Darwen, 2010)

Another book, Normal Forms and All That Jazz, is due for publication in the near future. Mr. Date was inducted into the Computing Industry Hall of Fame in 2004. He enjoys a reputation that is second to none for his ability to explain complex technical subjects in a clear and understandable fashion.



Foreword

This book went through several drafts, all of them originally written in the period 2001‑2002, and all of them reviewed by persons knowledgeable in the subject area as well as by members of the intended target audience. The final version was completed in April 2002. But publication of that version was delayed for legal reasons, one of which was that it was subject to a Non Disclosure Agreement, an agreement that expired only quite recently (in September 2011, to be precise). So the text that follows was actually written almost ten years ago, and is thus necessarily out of date in certain respects. Despite this fact, I think it’s worth publishing, even now, for reasons that will quickly become apparent from the text itself—not to mention the fact that I think it’s a historical document, of a kind. Indeed, partly because of this latter fact, I made a deliberate decision not even to attempt to bring the text up to date. (I’ve made a few small editorial revisions, but they’re all purely cosmetic in nature. I’ve also added an “About the Author” section—see page 10—that’s current as of 2011, not 2002.) Please be aware, therefore, that:

• Specific figures relating to CPU speeds, disk performance, etc. must all be understood to be as of 2002. (These remarks apply primarily to Chapter 11.)
• Comments (such as they are—actually there are very few of them) regarding the capabilities of commercial DBMS products must also all be understood as referring to the products in question as they were in 2002.
• For reasons beyond the scope of this book, the company (Required Technologies, Inc.) that owns the technology described in the body of the book is currently inactive, and the website www.requiredtech.com mentioned in the preface is currently dormant. Note: “Required” is an acronym, standing for Record Extraction, Query, and Update Interface with Re-Sort Everything Design.
• A few of the references mentioned in Appendix B have been superseded by later versions.

But none of these points is of any consequence as far as the technical message of the book is concerned.

Now, my reasons for writing the book in the first place are explained in detail in the preface. What they boil down to, however, is that I was (and still am) extremely enthusiastic about the technology under discussion. And I’d like to take this opportunity to demonstrate that I’m far from being alone in my enthusiasm. Indeed, Ted Codd (inventor of the relational model) was highly enthusiastic, too—so much so that he wrote a letter of endorsement to Steve Tarin, inventor of the technology in question, and I’m pleased to be able to quote that letter here in its entirety (except for some minor items of a personal nature and some very minor editorial changes). It’s dated July 10th, 2000.

I want to congratulate you for your brilliant work on the storage representation and management of data (U.S. Patent No. 6,009,432), titled Value-Instance-Connectivity Computer-Implemented Database (December 28th, 1999). The mathematical underpinnings and elegance of your approach coupled with the cleverness of its implementation are dazzling, and I have rarely in my life used terms such as these in a scientific context.


For more than 30 years, I have wished that somebody would tackle this problem and develop a solution. In some ways I was sorry that I did not tackle it myself. Now with your solution, I feel that the major part of my life’s work in relational database is complete. How I wish that we had your invention when the first RDBMSs were developed. It would have easily demolished the myth that relational systems could not perform well in high-performance transaction environments. Nevertheless, even today, I believe your work can revolutionize most, if not all, computer environments involving the management, storage, and access of data. It fits neatly into the original three-tier approach that I proposed in my relational writings.

As you know, in my research work developing the relational model, I very deliberately took great care to cleanly separate the user view of the data from the logical representation of the data (the relations themselves) and, similarly, separated this logical representation from the physical representation of the data in storage media. I did this for a variety of reasons. The most important of these was to insulate application programs and queries from changes in:

1. physical representation of the data,
2. successive releases of a given RDBMS,
3. movement from one RDBMS to another (inter or intra vendor).

I never specified how data was to be physically stored on media at all because I felt that additional research into the entire area of physical storage management was required, and since the three-tier scheme insulated applications from the representation of data in storage, this research could be ongoing and the results could be introduced into relational systems without impacting applications. Moreover, I also wanted to leave the door open for ingenuity and innovation among those vendors implementing RDBMSs.

Unfortunately, in the 30+ years between the date of my first publication and the granting of your patent, the challenge of storage structures and management had largely been ignored. All of the existing RDBMSs in the marketplace simply stored the logical relations (tables) and added indexes, hashing, and other rather commonplace techniques. With the advent of your technology, the kind of solution I had hoped for is at hand. While the impact of your invention in the RDBMS arena (including the implementation of the relational operators) is self-evident, your technology can be extremely effective in the subfields of business intelligence (OLAP, data warehousing, and data mining), query/update systems, very high volume transaction systems, etc.


[Your] invention is so exceptional and relevant to my own past work that I would be delighted to participate … and assist you in getting your work accepted. My very best wishes for every success.

/ signed / E. F. Codd

Now read on!

C. J. Date
Healdsburg, California
September 2011



Preface

The TransRelational™ Model—“the TR model” for short—represents a radically new, exciting, and elegant approach to implementing database management systems (DBMSs). In fact, the TR model represents a specific application of a more general technology known as the Tarin Transform Method, which is intended as an implementation technology for computerized data storage and retrieval systems of all kinds (not just DBMSs). The Tarin Transform Method is the subject of a United States patent—see reference [63] in the list of references in Appendix B at the back of the book—and is the intellectual property of a company called Required Technologies, Inc. My aim in what follows is to introduce the Tarin Transform Method, to describe the TR model in detail, and to show what these fundamental new ideas are likely to mean for the way we do business in the IT world (IT = information technology).

Required Technologies, Inc. has its headquarters at 130 West 42nd Street, Suite 2100, New York, New York 10036. The company was founded in January 1997. In the interests of full disclosure, I must make it clear right away that this book was written under a contract with Required Technologies; nevertheless, “the views expressed are my own,” as they say—indeed, I wouldn’t have signed the contract in the first place if I hadn’t been so profoundly impressed with the technology.

History does repeat itself, sometimes. As you might know, I wrote a book some years back on the subject of business rules, entitled WHAT Not HOW: The Business Rules Approach to Application Development (reference [34] in the present book). In the preface to that earlier book, I said this:

I must make it very clear right away that the book is not impartial. I’m very enthusiastic about business rules!—and I hope you will be too, when you’ve finished reading. In other words, this is definitely a book with an attitude: It explicitly champions the business rules idea, and it describes and explains what in my opinion are the merits and benefits of that idea. Why am I so enthusiastic? For two reasons: because of what the technology can do, and because it is so squarely in the spirit of “the original relational vision.”

Replace the references to business rules in this extract by references to TransRelational technology instead, and the resulting text applies 100 percent to the book you’re looking at right now. Note: I’m using the term TransRelational technology here (“TR technology” for short) as a synonym for the TR model, and I’ll follow this practice throughout the book.

Let me elaborate briefly on that matter of the original relational vision, though. In my considered opinion, the TR model is one of the most significant advances—quite possibly the most significant advance—in the data management field since Codd first invented the relational model, back in 1969. In particular, as I’ve more or less said already, TR technology provides us with a highly effective way to implement that model—a way that is dramatically different from ways that have been tried in the past and found wanting, including all of the ways encountered in today’s mainstream SQL products. And when I say “highly effective” here, I mean, among many other things, both of the following:

• Such an implementation would be orders of magnitude faster than today’s SQL products.
• Such an implementation would deliver a far greater degree of data independence than today’s SQL products.


In fact, I believe TR technology will allow us to build DBMSs that will, at last, truly deliver on the full promise of the relational model. This is a very strong claim—stronger than you might realize, perhaps—but I stand by it.

Aside: As a matter of fact, Required Technologies is due to release a commercial product based on TR technology round about the time this book appears in print. Now, it’s not the intent of the book to describe that product as such; however, you should be able to obtain information about that product—performance information in particular—by visiting the website www.requiredtech.com. End of aside.

Who should read this book: The book is meant for anyone interested in the field of data management who wants to learn about a truly dramatic development in that field. Thus, the target readership includes, but is not limited to, the following:

• Data and database management product implementers and other vendor personnel
• Data, database, and system administrators
• Data warehouse personnel
• Data management and DBMS professionals of all kinds
• Benchmark and performance specialists
• Computer science professors specializing in data and database management issues
• Database students, both graduate and undergraduate
• People responsible for DBMS product evaluation and acquisition
• Technically aware end-users

The only prerequisites are an elementary appreciation of data management concepts, techniques, and issues, including some knowledge of the relational model in particular (though in fact most of the pertinent relational ideas are reviewed briefly at appropriate points in the book—see in particular Chapter 2, Section 2.1, and Chapter 10, Sections 10.2‑10.7). However, if you happen to be familiar with the book mentioned above on business rules [34], it’s only fair to warn you that the present book is pitched at a rather more detailed technical level than that earlier book was.

How to read this book: Please note that the book is definitely meant to be read in sequence, not skipped or skimmed—except that you can omit Chapters 1 and 2, or portions thereof, if you’re already familiar with the material they cover (those chapters are mostly concerned with scene setting and other background matters). Also, I think I should say that, while the book overall is meant as a tutorial, some of the details in later chapters are a little tricky (that’s one reason why the material needs to be read in sequence). There are a few exercises embedded here and there in certain of those later chapters (and collected together for convenience in Appendix A), and I strongly recommend that you make the effort to do those exercises; I know from my own experience that they can help you understand the ideas very much better. There’s no substitute for working through detailed examples for yourself.


That said, I should also say that the book isn’t meant to cover every last detail of the TR model. It’s a tutorial, not a work of reference, and the treatment is deliberately not exhaustive—in fact, it’s in the nature of the TR model that it’s open ended and extensible. All I’ve tried to do is include enough material for you to get a good feel for how TR technology is seriously different from what’s been tried before, and why it offers so many significant benefits. In particular, I’ve tried to highlight some of the differences between this new approach and what the lawyers call “prior art”—regarding how joins are implemented, for example. What’s more, the fact that I haven’t tried to be exhaustive has allowed me to introduce certain conceptual simplifications (not lies!) into the presentation; very importantly, it has also allowed me to focus on the insights and intuition underlying all of the technical detail, instead of just describing that technical detail per se. The overall structure of the book is explained in Section 1.5, at the end of Chapter 1.

Acknowledgments: First of all, I’d like to thank the many people at Required Technologies, too numerous to mention individually, who supported me in various ways in the writing of this book. Second, the following people all reviewed earlier drafts of the text and made numerous helpful comments: Nagraj Alur, Lorrey James Bianchi, Charley Bontempo, Sharon Codd, Scott Cohen, Hugh Darwen, David McGoveran, Gary Morgenthaler, Fabian Pascal, Roberta Rousseau, Tom Sawyer, Steve Tarin, and Shelly Weinberg. A special vote of thanks goes to my old friend Sharon Codd for introducing me to Required Technologies in the first place, also to Gary Morgenthaler for drawing my attention to the hyperplane characterization discussed in Section 15.3, and to Roberta Rousseau for suggesting the idea of Appendix A. Thanks also to Addison-Wesley Publishing Company for permission to reuse certain earlier writings of mine as the basis for short portions of Chapters 1 and 2.

C. J. Date
Healdsburg, California
July 2002



Part I: Preliminaries


1 “Go Faster!”

1.1 Introduction

There’s an old joke, well known in database circles, to the effect that what users really want (and always have wanted, ever since database systems were first invented) is for somebody to implement the go faster! command. Well, I’m glad to be able to tell you that, as of now, somebody finally has ...

This book is all about a radically new database implementation technology, a technology that lets us build database management systems (DBMSs) that are “blindingly fast”—certainly orders of magnitude faster than any previous system. As explained in the preface, that technology is known as The TransRelational™ Model, or the TR model for short (the terms TR technology and, frequently, just TR are also used). As also explained in the preface, the technology is the subject of a United States patent (U.S. Patent No. 6,009,432, dated December 28th, 1999), listed as reference [63] in Appendix B at the back of this book; however, that reference is usually known more specifically as the Initial Patent, because several follow-on patent applications have been applied for at the time of writing. This book covers material from the Initial Patent and from certain of those follow-on patents as well.

The TR model really is a breakthrough. To say it again, it allows us to build DBMSs that are orders of magnitude faster than any previous system. And when I say “any previous system,” I don’t just mean previous relational systems. It’s an unfortunate fact that many people still believe that the fastest relational system will never perform as well as the fastest nonrelational system. Indeed, it’s exactly that belief that accounts in large part for the continued existence and use of older, nonrelational systems such as IMS [25,57] and IDMS [14,25], despite the fact that—as is well known—relational systems are far superior from the point of view of usability, productivity, and the like. However, a relational system implemented using TR technology should dramatically outperform even the fastest of those older nonrelational systems, finally giving the lie to those old performance arguments and making them obsolete (not before time, either).

I must also make it clear that I don’t just mean that queries should be faster under TR (despite the traditional emphasis in relational systems on queries in particular)—updates should be faster as well. Nor do I mean that TR is suitable only for decision support systems—it’s eminently suitable for transaction processing systems, too (though it’s probably fair to say that TR is particularly suitable for systems in which read-only operations predominate, such as data warehouse and data mining systems).

And one last preliminary remark: You’re probably thinking that the performance advantages I’m claiming must surely come at a cost: perhaps poor usability, or less functionality, or something (there’s no free lunch, right?). Well, I’m pleased to be able to tell you that such is not the case. The fact is, TR actually provides numerous additional benefits, over and above the performance benefit—for example, in the areas of database and system administration. Thus, I certainly don’t want you to think that performance is the only argument in favor of TR. We’ll take a look at some of those additional benefits in Chapters 2 and 15, and elsewhere in passing. (In fact, a detailed summary of all of the TR benefits appears in Chapter 15, in Section 15.4. You might like to take a quick look at that section right now, just to get an idea of how much of a breakthrough the TR model truly is.)

1.2 TR Technology and the Relational Model

As I said in the preface, I believe TR technology is one of the most significant advances—quite possibly the most significant advance—in the data management field since E. F. Codd first invented the relational model (which is to say, since the late 1960s and early 1970s; see references [5‑7], also reference [35]). As I also said in the preface, TR represents among other things a highly effective way to implement the relational model, as I hope to show in this book. In fact, the TR model—or, rather, the more general technology of which the TR model is just one specific but important manifestation—represents an effective way to implement data management systems of many different kinds, including but not limited to the following:

• SQL DBMSs

• Data warehouse systems

• Information access tools

• Data mining tools

• Object/relational DBMSs

• Web search engines

• Main-memory DBMSs

• Temporal DBMSs

• Business rule systems

• Repository managers

• XML document storage and retrieval systems

• Enterprise resource planning tools

as well as relational DBMSs in particular. Informally, we could say we’re talking about a backend technology that’s suitable for use with many different frontends. In planning this book, however, I quickly decided that my principal focus should be on the application of the technology to implementing the relational model specifically. Here are some of my reasons for that decision:


• Concentrating on one particular application should make the discussions and examples more concrete and therefore, I hope, easier to follow and understand.

• More significantly, the relational model is of fundamental importance; it’s rock solid, and it will endure. After all, it really is the best contender, so far as we know, for the role of “proper theoretical foundation” for the entire data management field. One hundred years from now, I fully expect database systems still to be firmly based on Codd’s relational model—even if they’re advertised as “object/relational,” or “temporal,” or “spatial,” or whatever. See Chapter 15 for further discussion of such matters.

• If your work involves data management in any of its aspects, then you should already have at least a nodding acquaintance with the basic ideas of the relational model. Though I feel bound to add that if that “nodding acquaintance” is based on a familiarity with SQL specifically, then you might not know as much as you should about the model as such, and you might know “some things that ain’t so.” I’ll come back to this point in a few moments.

• The relational model is an especially good fit with TR ideas; I mean, it’s a very obvious candidate for implementation using those ideas. Why? Because the relational model is at a uniform, and high, level of abstraction; it’s concerned purely with what a database system is supposed to look like to the user, and has absolutely nothing to say about what the system might look like internally. As many people would put it, the relational model is logical, not physical.

Let me elaborate on this point for a moment. Rather than saying it’s logical, not physical, my own preference—since the terms “logical” and “physical” aren’t very precisely defined—would just be to say that the relational model is indeed a model (a data model, that is) and is thus, by definition, not concerned with implementation issues. (I’ll have more to say on the difference between model and implementation in the next section.) Anyway, however you might like to express the fact, it’s certainly the case that the relational model emphasizes, far more than other data models do, the crucial distinction between different levels of the system—in particular, the distinction between the model or external (user) level and the implementation or internal (system) level. That’s why it’s a good fit with TR technology. Other data models—for example, the “object model” [3,4] or the “hierarchic model” [25,57] or the CODASYL “network model” [14,25]—muddy the distinction between those levels considerably. As a consequence, those other models give implementers far less freedom (far less than the relational model does, I mean) to adopt inventive or creative approaches to questions of implementation.

Note: I put the terms “object model,” “hierarchic model,” and “network model” in quotation marks in the foregoing paragraph because there’s considerable doubt as to whether those “models” are truly models at all, at least in the sense that the relational model is a model (see, for example, references [28] and [29] for further discussion of this point). Certainly most of those other “models” are quite ad hoc, instead of being firmly founded, as the relational model is, in set theory and formal logic. As I’ve already suggested, those other “models” also fail, much of the time, to make a clear separation between issues that truly are model issues and ones that are better regarded as implementation matters. Again, I’ll have more to say on this topic in the next section.


And one further point: Although TR is an implementation technology, and thus definitely at a lower level of abstraction than the relational model, it’s important to understand that it can still, like the relational model, be regarded as abstract to a degree (as indeed the very term “TR model” implies). In particular, it resembles the relational model in that it can be physically implemented in a variety of different ways. See Chapter 3 and several subsequent chapters for further discussion of this possibility.

• In my very firm opinion, the relational model is the right and proper foundation on which to build sound solutions to a variety of newer data management problems. Examples of such newer problems include user-defined data type support [40], subtyping and type inheritance support [41], and temporal data support [42]. Thus, if TR is a good basis for implementing the relational model, it follows that it should be a good basis for implementing solutions to those newer problems, too.

Actually, there’s quite a lot more to be said in connection with this business of using the relational model as a vehicle for explaining TR ideas. First of all, please note that I do mean the relational model, not SQL. SQL and the relational model aren’t the same thing! Indeed, considered as a concrete realization of the abstract relational model, SQL is very seriously flawed. This isn’t the place to go into details on this particular issue; suffice it to say that the SQL language suffers from far too many sins, of both omission and commission, for it ever to be honestly labeled “truly relational.” (For more specifics, see references [15‑17], [19], [31], and [39], among others.) As a consequence, SQL is not at all suitable as a foundation for explaining TR ideas (or numerous other ideas, come to that), which is why I don’t want to use it for that purpose in this book.

Another problem with SQL, possibly less serious but still significant, has to do with terminology. SQL terms are often quite actively misleading—a fact that again makes SQL unsuitable as a basis for explaining TR and other ideas. However, I will at least try to relate TR concepts and facilities to SQL constructs and terms, and I’ll show examples in SQL, whenever it seems to me to make sense to do so.

In connection with the foregoing, I should add that I’ll be basing all of my SQL examples on the official SQL standard [53]. A detailed tutorial on that standard (1992 version) is given in reference [39], while a brief overview of the extensions that were added to form the current (1999) version can be found in reference [47]. As you might know, however, no DBMS on the market fully supports even the 1992 version of the official standard—in fact, no DBMS could fully support it, owing to the many contradictions and inconsistencies it contains (see Appendix D of reference [39])—and so the examples might not always work exactly as advertised on your own favorite SQL product. Caveat lector.

While I’m on the subject of the SQL standard, by the way, let me add that the official standard pronunciation of the name “SQL” is “ess-cue-ell,” though you’ll often hear it pronounced “sequel.” In this book, I’ll favor the official pronunciation, thereby talking in terms of, for example, an SQL example instead of a SQL example.


Back to TR. Yet another important reason for explaining TR in terms of its usefulness for implementing the relational model specifically is that TR offers the possibility of building a DBMS product that truly is relational—something that, precisely because of the SQL shortcomings mentioned above (and contrary to popular belief, perhaps), has never yet been done. In other words, the potential benefits of the relational model, though well known and paid much lip service to, have never been fully realized (despite the dominance of so-called “relational” DBMSs in the marketplace), because the relational model has never been properly implemented. Now, however, we have the chance to do it right—and I very much hope that someone will be bold enough to take up this particular challenge as soon as possible.

Following on from the previous point, let me focus for a moment on one very significant “potential benefit of the relational model”: data availability and accessibility. It was always a dream of relational advocates that end-users should be able to query and even update the database directly, without having to go through the potential bottleneck of the IT department (IT = information technology). After all, the data in the database really does belong to those end-users, not to the IT department. But this goal was never properly achieved, because of performance concerns: Database administrators were worried—with good reason—that it would be all too easy for an end-user to issue a request that would bring the system to its knees (“the query that dims the lights”). Thus, all kinds of barriers had to be put in place to prevent the real users from getting direct access to their own data: security controls, time-of-day lockouts, performance monitors, query governors, and other mechanisms. (And all of those mechanisms in turn required further administration of their own, of course, making the database administrator’s job still harder.) But if performance isn’t a problem—that is, if the claims regarding TR performance are indeed valid—then those mechanisms shouldn’t be necessary, and we should be able, at last, to achieve the data availability and accessibility goal.


And one last point: Despite the foregoing criticisms of today’s SQL products, another potential application for TR technology arises precisely in connection with those products. To be more specific, it should be possible, at least in principle, to replace the backend code in such a product by code that uses TR technology instead. The user interface—namely, SQL—to the system would remain unchanged; the only change the user would see would be that the system would now run much faster than before. (The database administrator would see a change too, in that the administration job would now be much easier.)

1.3 Model vs. Implementation

Note: This section is based on material that originally appeared in reference [34], pages 33‑35, copyright (c) 2000 Addison Wesley Longman Inc. The material is reused here by permission of Pearson Education Inc.

Before I go any further, I need to say a little more about the notion of models—more precisely, data models—in general. I also need to say more about the difference between such models and their implementation (what reference [40] calls one of the great logical differences1) in particular. And I need to head off at the pass a certain confusion that might otherwise get in the way of understanding. The fact is, the term data model is, very unfortunately, used in the database community with two quite different meanings, and we need to be clear as to which of those two meanings is intended in any particular context. The first meaning is the one we have in mind when we talk about, for example, the relational model in particular. It can be defined as follows:

Data model (first sense): An abstract, self-contained, logical definition of the objects, operators, and so forth, that together make up the abstract machine with which users interact. The objects allow us to model the structure of data. The operators allow us to model its behavior.

Please note, incidentally, that I’m using the term objects here in its generic sense, not in the special rather loaded sense in which it’s used in the world of “object orientation” and “the object model” [3,4]. And then—very important!—we can usefully go on to distinguish the notion of a data model as just defined from the associated notion of an implementation, which can be defined as follows:

Implementation: The physical realization on a real machine of the components of the abstract machine that together constitute the data model in question.

For example, consider the relational model. The concept relation itself is, naturally, part of that model: Users have to know what relations are, they have to know they’re made up of tuples and attributes,2 they have to know what they mean (that is, how to interpret them), and so on. All that is part of the model. But they don’t have to know how relations are physically stored inside the system, they don’t have to know how individual data values are physically encoded, they don’t have to know what indexes or other physical access paths exist, and so on; all that is part of the implementation, not part of the model.


Or consider the concept join. The join operator is part of the relational model: Users have to know what a join is, they have to know how to invoke a join, they have to know what the input and output relations look like, and so on. Again, all that is part of the model. But users don’t have to know how joins are physically implemented—they don’t have to know what expression transformations take place under the covers, they don’t have to know what indexes or other physical access paths are used, they don’t have to know what physical I/O operations are executed,3 and so on; all that is part of the implementation, not part of the model.

In a nutshell, therefore: The model, in the first sense of the term, is what the user has to know; the implementation is what the user doesn’t have to know. (Just to elaborate for a moment: Of course, I don’t mean that users aren’t allowed to know about the implementation. They might indeed know something about it; they might possibly even use the model better if they do; but, to repeat, they don’t have to know about it.)

Now let’s turn to the second meaning of the term data model, which can be defined as follows:

Data model (second sense): A model of the persistent data of some particular enterprise.

Examples might include a model of the persistent data for some bank, or some hospital, or some government department. By the way, there’s a nice analogy here that I think can help clarify the relationship between the two meanings of the term:

• A data model in the first sense is like a programming language, whose constructs can be used to solve many specific problems, but in and of themselves have no direct connection with any such specific problem.
• A data model in the second sense is like a specific program written in that language—it uses the facilities provided by the model, in the first sense of that term, to solve some specific problem.

Having now, I hope, made clear the distinction between the two meanings, I can now be explicit and say that throughout the rest of this book, I’ll be using the term data model in its first (“abstract machine”) sense. What’s more, I’ll usually abbreviate the term data model to just model, unqualified; that is, I’ll take the term model, unqualified, to mean a data model specifically (barring explicit statements to the contrary, of course).
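
To make the model vs. implementation distinction concrete in code, here is a small illustrative sketch (in Python, purely for illustration; the relation names and sample data are invented for the example and aren't taken from the book). At the model level, all the user needs to know is what join means; the two functions below compute exactly the same result, even though one uses a naive nested loop and the other builds a temporary hash table, which is precisely the kind of detail the user never sees.

# Minimal sketch: the *model* says what join means; *how* it is computed
# (nested loop vs. hash lookup) is an implementation choice hidden from users.

def join_nested_loop(r1, r2):
    """Natural join of two relations (lists of dicts), by brute force."""
    result = set()
    for t1 in r1:
        for t2 in r2:
            common = set(t1) & set(t2)
            if all(t1[a] == t2[a] for a in common):
                merged = dict(t1)
                merged.update(t2)
                result.add(frozenset(merged.items()))
    return result

def join_hash(r1, r2):
    """The same join, but using a hash table on the common attributes."""
    common = set(next(iter(r1))) & set(next(iter(r2))) if r1 and r2 else set()
    index = {}
    for t2 in r2:
        index.setdefault(frozenset((a, t2[a]) for a in common), []).append(t2)
    result = set()
    for t1 in r1:
        for t2 in index.get(frozenset((a, t1[a]) for a in common), []):
            merged = dict(t1)
            merged.update(t2)
            result.add(frozenset(merged.items()))
    return result

# Hypothetical sample data (suppliers and shipments):
suppliers = [{"S#": "S1", "CITY": "London"}, {"S#": "S2", "CITY": "Paris"}]
shipments = [{"S#": "S1", "P#": "P1"}, {"S#": "S1", "P#": "P2"}]

assert join_nested_loop(suppliers, shipments) == join_hash(suppliers, shipments)

The two functions are interchangeable from the user's point of view; only their performance characteristics differ, and that is an implementation matter.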


1.4 So How is it Done?

Back now to TR specifically. What then is the crucial difference between the TR approach and previous approaches to implementing the relational model? In a nutshell, it’s this:

• Previous approaches have typically failed to recognize (or at least to act on) the clean separation between model and implementation that the relational model makes possible. In those systems, what the user sees and what’s stored internally are, typically, very similar to one another; typically, there’s a simple one-to-one correspondence between base relations as seen by the user and files as stored internally,4 and a simple one-to-one correspondence between the tuples and attributes in such relations and the records and fields in such stored files as well (see Fig. 1.1). In other words, what’s physically stored is effectively just a direct image of what the user logically sees.

Fig. 1.1: Direct-image implementation


But that direct-image style of implementation has many undesirable consequences. One of the most important is that the tuples of the relation in question are effectively kept in just one physical sequence (that is, one “stored sort order”—see Fig. 1.1 again), and certain auxiliary structures, typically indexes, therefore have to be built and maintained in order to provide access to those tuples in any other sequence. Those auxiliary structures in turn lead to numerous further problems, including among other things stored data redundancy, additional storage space requirements, DBMS implementation complexity, physical and logical database design complications, and query and update inefficiencies and overheads. I’ll elaborate on these matters in Chapter 2.

• In the TR approach, by contrast, what’s physically stored is very far from being a direct image of what the user logically sees. Instead, the relations and tuples seen by the user are transformed into internal structures that eliminate virtually all stored data redundancy and provide many stored sort orders simultaneously. Furthermore, the transformation is done without incurring large overheads in either space or time: The transform process is rapid in both directions, and the internal structures occupy a fraction of the storage space—a figure of 20 percent is quite typical—that would be needed for the data if it were kept in raw direct-image form. (Observe, therefore, that TR is an improvement over previous approaches in terms of both space and time: Faster execution times aren’t achieved at the cost of additional storage space—quite the opposite, in fact.)

And now, perhaps a little belatedly, I can explain what the term “transrelational” means. The usual meaning of “trans” is across, beyond, or through. But the “trans” in “transrelational” doesn’t stand for any of these; rather, it stands for transform or transformed, and it refers to the fact that, in TR, data as seen by the user—in other words, relational data—is transformed into very different internal representations, representations that are much more suitable for internal processing purposes. Thus, TR certainly doesn’t go “beyond” the relational model in the sense that it adds new logical data structures and operators to that model; rather, it goes “beyond” that model in that it introduces constructs that are explicitly oriented toward efficient implementation: constructs, in other words, that are beyond the purview of the relational model by definition.

Precisely because TR does transform the data as seen by the user instead of storing it in direct-image form, from time to time I’ll talk in what follows in terms of “transform” technology explicitly, thereby highlighting the fundamental distinction between TR and the traditional direct-image approach. In Parts II and III of this book, I’ll explain the TR transform process in detail; then, in Part IV, I’ll step back from that level of detail and consider the fundamental significance of the transform idea.

Now, it’s obviously impossible to be very specific with respect to the advantages of transform technology at such an early stage in the book. However, let me just say that I see a fruitful analogy with logarithms.5 As we all know, logarithms allow what would otherwise be complicated, tedious, and time-consuming numeric problems to be solved by transforming them into vastly simpler but (in a sense) equivalent problems and solving those simpler problems instead. Well, it’s my claim that—as I hope to show in the body of the book—TR technology does the same kind of thing for data management problems.

Reference [63] summarizes the distinction between TR and previous approaches (or in other words the transform vs. direct-image distinction) as follows:


Rather than [achieving] orderedness through increasing redundancy (that is, superimposing an ordered data representation on top of the original unordered representation of the same data), the present invention achieves orderedness through eliminating redundancy on a fundamental level.
—from the Initial Patent

In what follows, we’ll see in detail exactly how these ideas are realized in practice.
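
As a foretaste of the transform idea, here is a deliberately simplified sketch in Python (used purely for illustration, since the book itself contains no code at this point). It anticipates, in highly simplified form, the Field Values Table and Record Reconstruction Table developed in Chapters 4 and 5; the details below are my own simplification and should not be taken as the exact TR construction, and the three sample rows are merely in the style of the suppliers example. The point it illustrates is the one just made: every column provides its own sort order, no column keeps the original record order, and yet the original records remain cheaply recoverable.

# Simplified sketch of the "transform" idea (not the full TR model):
# sort each column independently, and keep just enough row-number
# information to zigzag across the sorted columns and rebuild any
# record on demand.

records = [                      # a user-level file (order is arbitrary)
    ("S1", "Smith", 20, "London"),
    ("S2", "Jones", 10, "Paris"),
    ("S3", "Blake", 30, "Paris"),
]
ncols = len(records[0])

# "Field values": column j sorted ascending -- one stored sort order per column.
perm = [sorted(range(len(records)), key=lambda r, j=j: records[r][j])
        for j in range(ncols)]
field_values = [[records[r][j] for r in perm[j]] for j in range(ncols)]

# "Reconstruction" pointers: from row i of column j, where does the same
# record's value sit in column j+1 (wrapping around to column 0)?
inverse = [{r: i for i, r in enumerate(perm[j])} for j in range(ncols)]
recon = [[inverse[(j + 1) % ncols][perm[j][i]] for j in range(ncols)]
         for i in range(len(records))]

def rebuild(start_row):
    """Rebuild one record by zigzagging across the sorted columns."""
    row, fields = start_row, []
    for j in range(ncols):
        fields.append(field_values[j][row])
        row = recon[row][j]
    return tuple(fields)

# Every record is recoverable, even though no column keeps the original order.
assert sorted(rebuild(i) for i in range(len(records))) == sorted(records)

Nothing here should be read as a performance claim; it is only meant to make the phrase "many stored sort orders simultaneously" tangible before the real machinery is introduced.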

1.5 Structure of the Book

The book overall is divided into four parts, plus two appendixes. A sketch of the contents follows.

Part I

Part I consists of three chapters. Following this initial chapter, Chapter 2 takes a look at the historical context; in particular, it explains the concept of direct-image implementation in more detail, and it discusses some of the problems that arise with such implementations. Chapter 3 then describes a conceptual framework, based on three levels of abstraction, that serves as a basis for explaining TR ideas in detail. That framework is assumed throughout the rest of the book.

Part II

Part II (seven chapters) describes the TR model. Chapters 4 and 5 in a sense form the heart of the book; they explain the two fundamental constructs of the TR model, the Field Values Table and the Record Reconstruction Table, very carefully and in considerable detail. Everything that follows builds on the ideas of these two chapters, and I recommend that you read them both as carefully as you can. In particular, they both include a number of embedded exercises, and I suggest very strongly that you attempt all of them. Working through those exercises will give you a good feel for how the fundamental TR algorithms really work—a much better feel than you can possibly get from simply reading the text.

Next, Chapter 6 addresses the issue of updates,6 a topic that Chapters 4 and 5 scarcely consider at all (deliberately, of course). Chapters 7‑9 then go on to discuss some major refinements to the basic model as described in Chapters 4, 5, and 6. Strictly speaking, the refinements in question are indeed just that, refinements, and therefore optional, but it seems to me that most if not all of them would surely be included in any commercial implementation of the TR model. What’s more, several of the more significant and interesting benefits of the TR model are direct consequences of those refinements. These chapters also all include embedded exercises, and again I recommend that you take those exercises seriously.

The last chapter in Part II, Chapter 10, discusses the use of the TR model in implementing the operators of the relational model (restrict, project, join, and so forth), showing how radically different those implementations are from what we’re used to seeing in traditional direct-image systems.


Part III

Divide-and-conquer is always a good pedagogical approach, and this book makes heavy use of it. In particular, Part II assumes (for the most part, at any rate) that the database is in main memory, and it ignores the complications that are introduced by the fact that real databases are usually too big to fit into memory.7 Part III then goes on to consider what happens when we drop this assumption. Chapter 11 describes the problem in general terms; Chapters 12‑14 then go on to discuss three highly TR-specific solutions to that general problem.

By the way, the point is worth making that, the foregoing paragraph notwithstanding, main-memory databases are becoming increasingly important in practice, and commercial products are becoming available that are optimized for such databases. The TR model is an excellent basis on which to build such products, as you’d probably expect.

Part IV

Part IV consists of a single wrap-up chapter (Chapter 15); it provides a summary and analysis of what’s been covered in earlier chapters, including in particular a summary of the benefits the TR model provides, and it offers a brief look at what the future might hold.



Appendixes

Finally, there are two appendixes: one collecting together all of the exercises from Part II (Appendix A), and one giving a consolidated set of references for the entire book (Appendix B). Appendix A in particular is provided as a convenient place where you might actually want to work the exercises; not only does it contain the exercise statements as such, it also repeats some of the necessary background material, and it should thus save you from having to do a lot of tedious page flipping and cross-referencing while you’re trying to work out your answers.

Endnotes

1. This useful term comes from Wittgenstein’s dictum that All logical differences are big differences. For further discussion, see reference [40].
2. In case you’re not familiar with these terms (or the term relation itself, come to that, or other related terms), they’ll all be explained in Chapter 2. Here just let me note that tuple is usually pronounced to rhyme with “couple.”
3. I/O = input/output. I’m assuming here that the data is physically stored on secondary storage media (magnetic disks, etc.).
4. I’ll explain the difference between base relations and other kinds in Chapter 2. In the interests of accuracy, I should also mention that the correspondence between base relations and stored files isn’t always one-to-one as I’m claiming here—some products allow several base relations to share the same stored file, and some allow a single base relation to span several stored files. However, these facts don’t significantly affect the bigger picture, and ignoring them (as I plan to do from this point forward) doesn’t materially affect any of the arguments I’m going to be making.
5. Thanks to Steve Tarin for suggesting this analogy.
6. Here and throughout this book, I follow convention in using the term update to refer to the INSERT, DELETE, and UPDATE operators considered generically. If I need to refer to the UPDATE operator specifically, I’ll set it in all caps, as here.
7. Here and throughout this book, I follow convention in using the unqualified term memory to mean main memory specifically.


2 The Historical Context

2.1 Introduction

The main purpose of this chapter is to explain in more detail some of the problems that arise in connection with what the lawyers call “prior art”—meaning, in the case at hand, systems that use the traditional direct-image approach to implementation. Of course, you can skip this material if you’re already familiar with conventional implementation technology. However, this first section does also introduce a few simple relational ideas, and you might at least want to make sure you’re familiar with those and fully understand them.

Consider Fig. 2.1, which depicts a relation called S (“suppliers”). Observe that each supplier has a supplier number (S#), unique to that supplier;1 a supplier name (SNAME), not necessarily unique (though in fact the sample names shown in the figure do happen to be unique); a rating or status value (STATUS); and a location (CITY). I’ll use this example to remind you of a few of the most fundamental relational terms and concepts.

Fig. 2.1: The suppliers relation S

• First of all, a relation can, obviously enough, be pictured as a table. However, a relation is not a table.2 A picture of a thing isn't the same as the thing! In fact, the difference between a thing and a picture of that thing is another of the great logical differences (see the remarks on this latter notion in Chapter 1, near the beginning of Section 1.3). One problem with thinking of a relation as a table is that it suggests that certain properties of tables—for example, the property that the rows are in a certain top-to-bottom order—apply to relations too, when in fact they don't (see below).

• Each of the five suppliers is represented by a tuple (pronounced as noted in Chapter 1 to rhyme with "couple"). Tuples are depicted as rows in figures like Fig. 2.1, but tuples aren't rows.

• Each supplier tuple contains four values, called attribute values; that is, the suppliers relation involves four attributes, called S#, SNAME, STATUS, and CITY. Attributes are depicted as columns in figures like Fig. 2.1, but attributes aren't columns.

• Attributes are defined over data types (types for short, also known as domains), meaning that every value of the attribute in question is required to be a value of the type in question. Types can be either system-defined (built in) or user-defined. For example, attribute STATUS might be defined over the system-defined type INTEGER (STATUS values are integers), while attribute SNAME might be defined over the user-defined type NAME (SNAME values are names). Note: For definiteness, I'll assume these specific types throughout what follows, where it makes any difference. I'll also assume that attribute S# is defined over a user-defined type with the same name (that is, S#), and attribute CITY is defined over the system-defined type CHAR (meaning character strings of arbitrary length).

• The tuples of a relation are all distinct. In fact, relations never contain duplicate tuples—the tuples of a relation form a mathematical set, and sets in mathematics don't contain duplicate elements. Note: People often complain about this aspect of the relational model, but in fact there are good practical reasons for not permitting duplicate tuples. A detailed discussion of the point is beyond the scope of this book; see any of references [13], [20], or [33] if you want to pursue the matter.

• There's no top-to-bottom ordering to the tuples of a relation. Although figures like Fig. 2.1 clearly suggest there is such an ordering, there really isn't—to say it again, the tuples of a relation form a mathematical set, and sets in mathematics have no ordering to their elements. Note: It follows from this point that we could draw several different pictures that would all represent the same relation. An analogous remark applies to the point immediately following.

• There's no left-to-right ordering to the attributes of a relation. Again, figures like Fig. 2.1 clearly suggest there is such an ordering, but there really isn't; like the tuples, the attributes of a relation form a set, and thus have no ordering. (By the same token, there's no left-to-right ordering to the components of a tuple, either.) No relation can have two or more attributes with the same name.

• The suppliers relation is in fact a base relation specifically. In general, we distinguish between base and derived relations; a derived relation is one that is derived from, or defined in terms of, other relations, and a base relation is one that isn't derived in this sense. Loosely speaking, in other words, the base relations are the "given" ones—they're the ones that make up the actual database—while the derived ones are views, snapshots, query results, and the like [33]. For example, given the base relation of Fig. 2.1, the result of the query "Get suppliers in London" is a derived relation that looks like this:

Another way to think about the distinction is that base relations exist in their own right, while derived ones don't—they're existence-dependent on the base relations.

• Every relation has at least one candidate key (or just key for short), which serves as a unique identifier for the tuples of that relation. In the case of the suppliers relation (and the derived relation just shown as well), there's just one key, namely {S#}, but relations can have any number of keys, in general. Note: It's important to understand that keys are always sets of attributes (though the set in question might well contain just a single attribute). For this reason, in this book I'll always show key attributes enclosed in braces, as in the case at hand—braces being used by convention to bracket the elements that make up a set.

• As you probably know, it's customary (though not obligatory) to choose, for any given relation, one of that relation's candidate keys—possibly its sole candidate key—as primary; thus, for example, we might say in the case of the suppliers relation that {S#} is not just a key but the "primary" key. In figures like Fig. 2.1, I'll follow the convention of identifying primary key attributes by double underlining.

• Finally, relations can be operated on by a variety of relational operators. In general, a relational operator is an operator that takes zero or more relations as input and produces a relation as output. Examples include the well-known operators restrict, project, join, and so on. By the way, the—very important!—fact that the output from any given relational operation is another relation is referred to as the closure property (the relational closure property, to be more precise, because other kinds of closure are also discussed in the literature). It's the relational closure property that, among other things, allows us to write nested relational expressions.

Perhaps I should say a little more about the relational operators. I assume you're already somewhat familiar with the restrict, project, and join operators in particular; but if you aren't, then at least I'll be showing examples of their use at various points in subsequent chapters, and those examples should be sufficient to illustrate the general idea in each case. Nonetheless, a few words of explanation might be helpful here. In outline:

• The restrict operator takes a single relation as input and returns all tuples of that relation that satisfy a specified condition. Here's an SQL example:

SELECT S.S#, S.SNAME, S.STATUS, S.CITY
FROM S
WHERE S.CITY = 'London' ;

The result is the derived relation shown a few paragraphs back3 (the SELECT statement in this example is, of course, an SQL formulation of the query "Get suppliers in London" discussed earlier, when derived relations were first mentioned).

• The project operator takes a single relation as input and returns all (sub)tuples that remain after specified attributes have been removed. Here's an SQL example:

SELECT S.S#, S.CITY
FROM S ;

Here’s the result:

• The join operator takes two relations as input (in which common attributes—that is, attributes with the same name—must be of the same type) and returns all "combined" tuples such that:

a) Each such combined tuple involves all of the attributes of the two given relations (and no other attributes); and

b) The combined tuple in question includes a tuple from each of the two relations as a subtuple. Note that the two subtuples necessarily have common values for common attributes, by definition.

Here's an SQL example:

SELECT FIRST.S# AS X#, SECOND.S# AS Y#
FROM S AS FIRST, S AS SECOND
WHERE FIRST.CITY = SECOND.CITY ;

In this particular example, the two relations being joined are in fact one and the same. What's more, the example doesn't just involve a join—it does involve a join, of course, but then it takes a projection of the result of that join over two attributes, and then it renames those two attributes. The final result looks like this:

Note: We could tidy up this result by extending the WHERE clause in the original SQL formulation to include the specification “AND FIRST.S# < SECOND.S#” (immediately before the semicolon). The result would then look like this:

If you want to learn more about the relational model, and more about the relational operators in particular, please refer to the tutorial treatment in reference [33]. A more formal treatment can be found in reference [40]. Reference [35] might also be helpful.

Incidentally, you're probably well aware that, in SQL, relations, tuples, and attributes are called tables, rows, and columns, respectively. You might also find these terms less intimidating (more user-friendly) than relations, tuples, and attributes. As I've tried to show, however, there are good reasons ("a relation is not a table," etc.) for using the more formal terms. In fact, I want to come back later and say more about this question of terminology—in particular, I want to explain in more detail just why I want to use the more formal terms rather than the SQL ones—and I'll do that in Chapter 3.

So much for relational terminology and concepts (for now, at any rate). In later sections of this chapter, I'll use the suppliers relation of Fig. 2.1 as a basis for showing how relations, and operations on relations, are typically implemented in today's SQL products. First, however, I need to say a little more about the concept of ordering.

2.2 Ordering

As explained in the previous section, the relational model has no concept of ordering—neither left-to-right attribute ordering, nor top-to-bottom tuple ordering. Of course, these omissions are deliberate; there are many reasons why it would be a bad idea to include any concept of ordering at the relational level (see, e.g., reference [33] if you want further explanation). However, ordering is of considerable importance, for a variety of pragmatic reasons, at other (nonrelational) levels of the system. To be specific, it's important at the presentation, internal, and hardware levels. To elaborate:

• At the presentation level, users usually like to see certain top-to-bottom and left-to-right orderings when results are extracted from the system and displayed or printed. In fact, of course, it's virtually impossible to display or print results without such orderings, owing to the inherent nature of display and print media; by way of an example, consider Fig. 2.1! But, of course, human users usually want the orderings in question to be meaningful ones, not just arbitrary—again, see Fig. 2.1 for an example—because such orderings can help the user to grasp more readily the real import of what's being displayed or printed.4 That's why relational systems typically provide an ORDER BY operator, whose purpose is to arrange the tuples of some specified relation into some specified top-to-bottom sequence. For example, Fig. 2.1 might represent the result—the ordered result, that is, with top-to-bottom tuple ordering—from the following SQL query:

SELECT S.S#, S.SNAME, S.STATUS, S.CITY
FROM S
ORDER BY S# ;

Points arising:

• Note very carefully that although the input to ORDER BY is a relation, the output isn't; by definition, the output is ordered, and relations aren't. (The output can be thought of instead as a list—or a sequence, or a vector, or a one-dimensional array—of tuples.) Thus, when I say that Fig. 2.1 might represent the result of the foregoing query, what I mean is that it might be interpreted as representing that result, instead of (as previously) being interpreted as representing the suppliers relation as such. In other words, while ORDER BY is certainly a respectable and useful operator, it isn't a relational operator as such, and it isn't part of the relational model.

• In SQL in particular, the ORDER BY clause orders the tuples top to bottom, while the SELECT clause orders the attributes left to right. Thus, we might further interpret Fig. 2.1 as representing the result of the foregoing SQL query with its specified top-to-bottom tuple ordering and with its specified left-to-right attribute ordering.

• At the internal level, many implementation algorithms rely on the ability to access the stored data in some specific sequence. For example, it's well known that—at least in a conventional system—sort/merge is one of the most efficient ways of implementing the relational join operation [59]. That is, a good way to evaluate the relational expression A JOIN B, where the join is to be done on the basis of values of some common attribute C, is as follows (a small code sketch at the end of this section illustrates the idea):

1. Sort the stored version of the tuples of relation A into sequence according to values of the common attribute C. Call the result List 1.
2. Do the same for relation B, calling the result List 2.

3. Combine ("merge") entries from List 1 and List 2 that have the same value for the common attribute C. This process can be done in a single pass over each of the two lists (that's why it's efficient).

Clearly, this algorithm can be made considerably more efficient if the stored versions of A and B are in the desired sequence already, because then the two sorts won't be necessary.

• At the hardware level, it's in the nature of essentially all physical storage media—from main memory to disks and tapes and all the way up and down the storage hierarchy—that there's an inherent ordering to the way the medium is physically accessed. Clearly, therefore, it's desirable that the ordering in question be put to some good use; that is, we'd like to store the data in such a way that the sequence in which it has to be physically accessed corresponds to the sequence in which the implementation most often needs it.

From all of the above, it follows that there are a couple of important implementation challenges to be faced. First, at the hardware level, there's clearly only one physical sequence available, so we definitely want to make the best use of it we can; what's the best way to exploit the single physical ordering that's available to us? Second, both users and the implementation often need to see the data in an ordering other than that single physical one; how can we impose additional orderings on the stored data, over and above that physical one?
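Here, as promised, is a minimal sketch of the sort/merge idea, written in Python purely for illustration. The representation of tuples as Python dictionaries, the parameter names, and the in-memory setting are all assumptions made for this example only; no real DBMS would code its join quite this way, of course.

def sort_merge_join(a_tuples, b_tuples, c):
    # Steps 1 and 2: sort both inputs on the common attribute C.
    # (If the stored data were already in C sequence, these two sorts
    # could be skipped, which is precisely the point made above.)
    list1 = sorted(a_tuples, key=lambda t: t[c])
    list2 = sorted(b_tuples, key=lambda t: t[c])

    # Step 3: merge, in a single pass over each list.
    result, i, j = [], 0, 0
    while i < len(list1) and j < len(list2):
        if list1[i][c] < list2[j][c]:
            i += 1
        elif list1[i][c] > list2[j][c]:
            j += 1
        else:
            # Combine every pair of tuples sharing this C value.
            value = list1[i][c]
            i_end, j_end = i, j
            while i_end < len(list1) and list1[i_end][c] == value:
                i_end += 1
            while j_end < len(list2) and list2[j_end][c] == value:
                j_end += 1
            for ta in list1[i:i_end]:
                for tb in list2[j:j_end]:
                    result.append({**ta, **tb})
            i, j = i_end, j_end
    return result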

2.3 Indexing

Note: This section and the next three are based on material that originally appeared in reference [26], pages 724-743, copyright (c) 1995 Addison Wesley Longman Inc. The material is reused here by permission of Pearson Education Inc.

Now let's get back to the suppliers relation of Fig. 2.1. For the rest of the chapter, I'm going to assume—as SQL systems typically do assume—that (a) each tuple of that relation maps to its own stored record at the internal level, and (b) that stored record looks very much like the corresponding tuple; in other words, I'm assuming what in Chapter 1 I called a direct-image style of implementation. I'm also going to assume that

a) Those stored records together constitute a stored file,5 and
b) That stored file is physically stored in supplier number sequence, as suggested by Fig. 2.1 (first supplier S1, then supplier S2, and so on).

Thus, we can now reinterpret Fig. 2.1, no longer as a picture of a relation as such, but rather as a picture of a possible stored representation of that relation. Fig. 2.2 is a revised version of Fig. 2.1 that highlights significant aspects of that reinterpretation; note in particular that it emphasizes the top-to-bottom sequence of records (in the next chapter, we'll be concerned with the left-to-right sequence of fields, too). Note also that there's no double underlining in Fig. 2.2; that's because double underlining is used to indicate primary keys, and primary keys are a relational concept, not a stored-file one. Of course, we could define an analogous concept for stored files too, but keeping it as a purely relational notion helps to distinguish pictures of relations like Fig. 2.1 from pictures of stored files like Fig. 2.2.

Fig. 2.2: Direct-image representation of suppliers

The question we now need to consider, then, is this: How can we impose additional orderings on the stored file illustrated in Fig. 2.2—for example, an ordering that lets us access suppliers in city name sequence (Athens, then London, then Paris)? The historical answer to this question has always been to introduce certain auxiliary storage structures, whose purpose is precisely to represent such additional orderings and to provide what are called alternative access paths to the data. Indexes are probably the best-known example of such a structure. In the case at hand, for example, we might create a city index as shown in Fig. 2.3.

Fig. 2.3: Suppliers with a city index

Observe now that, as the figure suggests, the city index itself constitutes another stored file, and its records—namely, the three index entries—are physically stored in (let's assume) city name sequence. Each index entry contains a city name, together with pointers to all of the stored records for suppliers located in the city in question. Thus, if the user issues the following SQL query—

SELECT S.S#, S.SNAME, S.STATUS, S.CITY
FROM S
ORDER BY CITY ;

—then the system can obtain the desired result in the desired sequence by carrying out a sequential search on the city index and following each pointer in turn to the corresponding record in the suppliers file.

Perhaps I should digress for a moment to explain what I mean by the term pointer. Basically, a pointer variable is a variable whose permitted values are pointer values, and a pointer value is, conceptually, the address of some location in storage—though it doesn't have to be a physical address, of course, and in fact it usually isn't, especially if the storage we're talking about is secondary storage specifically (see the remarks below regarding logical disk addresses). Note: From this point forward, I'll use the unqualified term pointer, sloppily but conveniently, to mean either a pointer variable or a pointer value, depending on context.

Now, it should be clear that tangles of arrows like those in Fig. 2.3 can make such figures rather messy and difficult to follow. For that reason, from this point forward I'll favor a different style: I usually won't try to show pointers as arrows—instead, I'll show the corresponding pointer values. Fig. 2.4 is a revised version of Fig. 2.3 that takes this latter approach. Observe that the supplier records have been given sequential record numbers, and those record numbers have then been used as references to those records—that is, as appropriate pointer values—within the city index. (By the way, the index entries are stored records too, as we already know, and they therefore have record numbers of their own. However, I haven't tried to show those in Fig. 2.4.)

Fig. 2.4: Fig. 2.3 revised to show pointer values (record numbers)

Of course, what I'm here calling "record numbers" won't really be simple sequential numbers, as such, in any real implementation; instead, they'll be some kind of logical disk address (assuming the records are kept in secondary storage)—for example, a page and offset address [26], or some other kind of address that doesn't change if the records are physically moved around in storage. (Sequential numbers would clearly be a problem if, for example, a new record were inserted somewhere in the middle.) Despite this fact, however, I'll continue to show sequential numbers in figures like Fig. 2.4, for obvious reasons of simplicity. I should say too that those record numbers are indeed, as suggested, best thought of as addresses, not as part of the records they refer to. Certainly they aren't physically stored as part of the records they refer to, which is why Fig. 2.4 shows them separately, alongside but separate from the suppliers file itself.

Let's get back to the question of indexing. In general, of course, a given stored file can have any number of associated indexes; in the case of suppliers, for example, we might have a status index as well as the city index just discussed. We might even have an index on everything—a supplier number index, a supplier name index, a status index, and a city index—in which case the suppliers file would be said to be "fully inverted" [25,26]. (In fact, we might go further still, using a technique called combined indexes [56,61,64], but to discuss that idea in detail would take us too far beyond the scope of the present discussion.)

Observe next that, as you probably know, indexes can be used to provide direct as well as sequential access to the indexed file. For example, consider the query "Get suppliers in London" (which involves a relational restrict operation, recall). If there's a city index, then the system now has two strategies available to it for implementing that query:

• Sequential search: Do a sequential search on the suppliers file, looking for all records with city name equal to London.

• Direct lookup via the index: Do a sequential search on the city index, looking for the London entry, and then follow the pointers to the corresponding records in the suppliers file.
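To make the two strategies a little more concrete, here's a small Python sketch. The in-memory dictionaries standing in for the suppliers file and the city index, and the particular sample values, are purely hypothetical; the point is only to show the shape of the two access paths, not how any real system codes them.

# Hypothetical stand-ins for the suppliers file and the city index.
# The dictionary keys 1-5 play the role of the record numbers in Fig. 2.4.
suppliers = {
    1: {'S#': 'S1', 'CITY': 'London'},
    2: {'S#': 'S2', 'CITY': 'Paris'},
    3: {'S#': 'S3', 'CITY': 'Paris'},
    4: {'S#': 'S4', 'CITY': 'London'},
    5: {'S#': 'S5', 'CITY': 'Athens'},
}
city_index = {'Athens': [5], 'London': [1, 4], 'Paris': [2, 3]}

def london_suppliers_by_scan():
    # Strategy 1: sequential search on the suppliers file itself.
    return [rec for rec in suppliers.values() if rec['CITY'] == 'London']

def london_suppliers_by_index():
    # Strategy 2: find the London entry in the index, then follow
    # each pointer (record number) to the corresponding stored record.
    return [suppliers[recno] for recno in city_index.get('London', [])]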

But this fact—the fact, that is, that there are now two strategies—makes life harder for the implementation. The two strategies will clearly have different performance characteristics. For example, if the ratio of London suppliers to others is small, then direct lookup via the index will probably be more efficient; by contrast, if most suppliers are in fact in London, then sequential search will probably be more efficient. So the implementation now needs to include a new component, the optimizer, whose job among other things is to select the appropriate access path to use in implementing any given query.

Now, it's easy to see that access path selection has the potential to be a seriously complex business. For example, in the case at hand, does the optimizer know—and if so, how does it know—what the ratio of London suppliers to others is? If it does know, then how does it pick a threshold ratio for deciding between the two strategies? If the user asks to see suppliers in status sequence instead of city name sequence and there's no status index, is it better to sort the data dynamically or to build a new status index on the fly? If the user asks to see suppliers in London with status 20, and if city and status indexes both exist, which one should be used? Perhaps both? Perhaps neither? And so on and so forth. Perhaps you can begin to understand why I said in Chapter 1 that indexes and other auxiliary structures can lead to DBMS implementation complexity.

In fact, I said such structures can lead to a variety of other problems as well as complexity. Let's take a quick look at some of the others.

• Stored data redundancy: This one's obvious. For example, there's clearly more repetition of city names in Fig. 2.4 than there was in Fig. 2.2. Think what has to be done if a city changes its name—New Amsterdam becomes New York, Peking becomes Beijing, Bombay becomes Mumbai, St. Petersburg becomes Leningrad and vice versa, and so on and so forth. Of course, changing a city name is a problem in Fig. 2.2 too, but the problem is considerably worse in Fig. 2.4 (remember that the city index has to be physically kept in city name sequence).

• Additional storage space requirements: This one's obvious, too; again, compare Figs. 2.4 and 2.2. A heavily indexed file can easily occupy several times as much storage as the raw data by itself.

• Physical database design complications: How do we decide what indexes should exist? Who decides? Can the system itself give us any guidance? What about query vs. update tradeoffs (see further discussion below)? And what about tuning the system for performance (for example, adding and dropping indexes) as and when performance requirements change? By the way, note the implied need here for performance monitoring tools and performance tuning "knobs" as well—meaning more implementation complexity and more database administration headaches.

• Logical database design complications: In principle, indexes and the like are a physical design consideration only and should have no impact on logical design. Precisely because of the direct-image approach to implementation adopted in most of today's systems, however, physical design considerations have a habit in practice of reflecting back on the logical design (see the discussion of "data independence" at the very end of this chapter).

• Query inefficiencies and overheads: The inefficiencies occur if the optimizer does a less than perfect job in the access path selection process (which in practice it probably will). The overheads occur in carrying out that access path selection process in the first place.

• Update inefficiencies and overheads: The very same inefficiencies and overheads that occur with queries clearly occur with updates as well. However, there's another, perhaps more important, consideration to be taken into account in connection with updates. To be specific, while an index might perhaps be used to speed up queries, it will at the same time slow down updates. For example, every time a new supplier is inserted (meaning a new stored record has to be added to the suppliers file), a new entry also has to be added to the city index. By way of another example, consider what the system has to do to the city index of Fig. 2.4 if supplier S2 moves from Paris to Rome.

There's one more point—an important one—to be made in connection with the foregoing list of problems. As we've seen, indexes can be used to impose different orderings on a given stored file and thus (in a sense) "level the playing field" with respect to different processing sequences; all of those sequences are equally good from a logical point of view. But they certainly aren't equally good from a performance point of view. For example, even if there's a city index, processing suppliers in city name sequence will involve, in effect, random accesses to storage, precisely because the supplier records aren't physically stored in city name sequence but are scattered all over the disk.

In fact, at most one index on a given stored file can have a logical sequence that matches the physical sequence in which the records of that file are stored. (In the case of the suppliers file, given our assumption that the physical sequence is by S# value, that index would have to be a supplier number index specifically.) That special index, if it exists, is said to be a clustering index; sequential access via a clustering index is reasonably efficient, because the logical sequence defined by the index—which corresponds to the physical sequence of the index itself—reflects the physical sequence of the indexed file as well.

By the way, you might be thinking, in the case of suppliers specifically, that there wouldn't be much point in having a supplier number index. Why not? Because such an index would apparently have the same number of entries as the indexed file has records (since supplier numbers are unique), and searching that index would thus be just as slow as searching the indexed file itself. In practice, however, index entries usually point, not, as I've been pretending, to individual records in the indexed file, but rather to blocks (or pages) of records in that file. As a consequence, a supplier number index might indeed make sense after all. In fact, today's SQL products almost certainly would have such an index—in part because such an index is typically used as the means of enforcing the required uniqueness constraint.

Let me close this section by noting that, of course, what I've said so far has merely scratched the surface of the topic of indexing in general. The index structures found in real DBMSs are usually much more sophisticated than those I've been describing.
For example, some products support join indexes [68], which are indexes that index two stored files at the same time and are specifically intended to improve the performance of certain join operations. Likewise, some products support bitmap indexes, which are indexes that are specifically designed to improve the performance of certain other retrieval operations.6 In other words, numerous refinements on the basic indexing idea have been developed over the years. However, all of those refinements suffer from the same basic problems, pretty much; thus, the discussions above should be sufficient to illustrate the idea of indexing in general and to show what some of those problems are.

Note: Remarks analogous to those of the foregoing paragraph apply to the topics of the next three sections as well, as you’d probably expect. If you’d like to read more about those topics (or about indexing, of course), or about a variety of related topics that are beyond the scope of this book, then I recommend any of references [50], [55], or [69].
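Before moving on, and purely by way of a rough illustration of the bitmap idea mentioned above (this is my own toy sketch, not a description of any particular product): suppose, hypothetically, that records 1 and 4 are the London suppliers, records 2 and 3 the Paris suppliers, and record 5 the Athens supplier. A bitmap index on CITY can then be thought of as one bit vector per distinct city value, with one bit position per record, so that restriction conditions become cheap bitwise operations.

# One bit per record (bit n-1 stands for record n), one vector per value.
city_bitmaps = {
    'Athens': 0b10000,   # record 5
    'London': 0b01001,   # records 1 and 4
    'Paris':  0b00110,   # records 2 and 3
}
status_bitmaps = {       # likewise hypothetical status values
    10: 0b00010,         # record 2
    20: 0b01001,         # records 1 and 4
    30: 0b10100,         # records 3 and 5
}

# "Suppliers in London with status 20" is just a bitwise AND:
matches = city_bitmaps['London'] & status_bitmaps[20]   # 0b01001 -> records 1 and 4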

2.4 Pointer Chains

Pointer chains are another auxiliary structure that can be used to impose additional orderings on a given stored file.7 For example, suppose again, as in the previous section, that we want to be able to access suppliers in city name sequence. Then we might create the pointer chain structure shown in Fig. 2.5.

Fig. 2.5: Suppliers with a city parent file

As you can see, there are two stored files in the figure, a suppliers file and a city file, much as there were in the indexing example in Figs. 2.3 and 2.4. This time, however, the city file is not an index but what is sometimes called a parent file; the suppliers file is accordingly called a child file, and the overall structure is an example of parent/child organization. The parent file—which is stored (let's assume) in city name sequence—contains one stored record for each distinct city, giving the city name and acting as the head of a chain of pointers connecting together all of the child records for suppliers in that city. For example, the parent record for Paris contains the pointer value c2; child record c2 (the one for supplier S2) contains the pointer value c3; and child record c3 (the one for supplier S3) contains the pointer value p3, which takes us back to the parent record for Paris again. Note that the city names have been removed from the suppliers file, thereby eliminating some data redundancy. (A small code sketch at the end of this section illustrates the chain-following procedure.)

If the user now issues the same SQL query as in the previous section—

SELECT S.S#, S.SNAME, S.STATUS, S.CITY
FROM S
ORDER BY CITY ;

—then the system can obtain the desired result in the desired order by doing a sequential search on the city file and, for each record in that file in turn, following the pointer chain to the corresponding records in the suppliers file.

In general, of course, a given stored file can have any number of associated parent files and pointer chains, just as it can have any number of indexes. For example, we might have a status parent file for suppliers as well as the city parent file just discussed. What's more, pointer chain structures can be used to provide direct access as well as sequential access to the child file, again as with indexes. For example, to find all suppliers in London (a relational restrict operation again), the system can do either of the following:

• Sequential search: Do a sequential search on the suppliers file, and, for each supplier in turn, follow the pointer chain to the city file to find out whether the supplier is in London.

• Direct lookup via the parent file: Do a sequential search on the city file, looking for the London entry, and then follow the pointer chain to find all corresponding records in the suppliers file.

In general, pointer chain structures have advantages and disadvantages analogous to, though not the same as, those that apply to index structures. It's not worth going into those advantages and disadvantages in detail here; let me just make a few pertinent observations.

• The principal advantage of pointer chains over indexes is that the insert/delete algorithms are somewhat simpler [26]. Also, the parent/child structure in the example is less redundant than, and will probably occupy less storage space than, the corresponding index structure, because each city name appears exactly once instead of several times.

• The principal disadvantages are as follows:

• For a given city, the only way to access the Nth supplier in that city is to follow the chain and access the 1st, 2nd, ..., (N‑1)st supplier too.

• Processing suppliers in city name sequence will still typically involve random accesses to storage, unless the supplier records happen to be physically stored ("clustered") in city name sequence, which in practice they probably won't be.

• Although the pointer chain structure might help with the query "Get the suppliers in a given city," it's of no help—in fact, it's a positive hindrance—for the inverse query "Get the city for a given supplier" (contrast the situation with indexes).

• Imposing a new pointer chain structure on an existing stored file is a highly nontrivial matter (partly because the pointer chains actually run through the stored records); in fact, such an operation will typically require a database reorganization.8 By contrast, it's a comparatively straightforward matter to create a new index over an existing stored file.
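Here is the chain-following sketch promised above. The record identifiers p1-p3 and c1-c5 follow the style suggested by Fig. 2.5 (parent records in city name sequence, child records in supplier number sequence), but the specific contents are hypothetical, and the Python code is meant only to pin down the mechanics of following a chain.

# Parent (city) file: each record gives a city name and heads a chain.
# Child (supplier) file: each record points to the next record on its
# chain; the last child on a chain points back to the parent record.
# Note that the city names themselves appear only in the parent file.
city_file = {
    'p1': {'CITY': 'Athens', 'NEXT': 'c5'},
    'p2': {'CITY': 'London', 'NEXT': 'c1'},
    'p3': {'CITY': 'Paris',  'NEXT': 'c2'},
}
supplier_file = {
    'c1': {'S#': 'S1', 'NEXT': 'c4'},
    'c2': {'S#': 'S2', 'NEXT': 'c3'},
    'c3': {'S#': 'S3', 'NEXT': 'p3'},
    'c4': {'S#': 'S4', 'NEXT': 'p2'},
    'c5': {'S#': 'S5', 'NEXT': 'p1'},
}

def suppliers_in(parent_id):
    # Follow the chain from the parent record until it leads back home.
    result, ptr = [], city_file[parent_id]['NEXT']
    while ptr != parent_id:
        record = supplier_file[ptr]
        result.append(record['S#'])
        ptr = record['NEXT']
    return result

# suppliers_in('p3') returns ['S2', 'S3'], i.e. the Paris chain described
# in the text (p3 -> c2 -> c3 -> back to p3).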

2.5 Hashing

After indexes, the auxiliary structures most widely used in practice are probably hash structures. Such structures are very good for direct access but typically don't support sequential access at all; in fact, hash structures, unlike the index and pointer chain structures described in previous sections, aren't usually thought of as a way of imposing ordering as such (unless the hash function—see later—is "order-preserving" [51], which in practice it usually isn't). But hashing is important, and it deserves a brief discussion here.

Hashing is a technique for providing fast access to a specific stored record on the basis of a given value within that record. It works as follows (in outline):

• Each stored record is placed in storage at a location whose address is computed as some function (the hash function) of some specific value contained within that record. The computed address is called the hash address.

• To store the record initially, the system computes the hash address for the new record and stores the record at that location. To access the record subsequently, the system performs the same computation as before and goes to the record at the computed address.

By way of a simple example, suppose we have supplier records for suppliers S100, S200, S300, S400, S500 (instead of S1, S2, S3, S4, S5, respectively), and consider the following hash function:

hash address = MOD ( numeric part of S# value, 13 ) + 1

(where MOD stands for “modulus”; the expression MOD(a,b) returns the remainder after dividing a by b). This is a trivial example of a very common class of hash function called division/remainder. (For reasons beyond the scope of this book, the divisor in a division/remainder hash is usually chosen to be prime, as in our example.) The hash addresses for our five suppliers are thus 10, 6, 2, 11, and 7, respectively, giving us the hash structure shown in Fig. 2.6.

Fig. 2.6: Suppliers with a supplier number hash

In addition to showing how hashing works, the example also shows why the hash function is necessary. It would theoretically be possible to use an identity hash function, thereby storing the record for supplier S100 at location 100, the record for supplier S200 at location 200, and so on. Such a technique would generally be inadequate in practice, however, because the range of values to be hashed will usually be much greater than the range of available addresses. For instance, suppose supplier numbers are in fact in the range S000‑S999, as in the example; then there would be 1000 possible distinct supplier numbers, whereas there might in fact be only ten or so actual suppliers at any given time. In order to avoid a considerable waste of storage space, therefore, it would be nice to find a hash function that will map any value in the range 000‑999 to one in the range 1‑10 (say). To allow room for growth, it's usual to extend the target range by 25 percent or so; that's why I chose a function in the example that generated values in the range 1‑13 instead of 1‑10.

The example also shows clearly why, as mentioned earlier, hashing typically provides no support for sequential access. Indeed, the sequence of records within the hash structure will almost certainly not have any sensible logical interpretation (again, unless the hash function is order-preserving).9 What's more, there will typically be gaps of arbitrary size between consecutive stored records.

Another disadvantage of hashing in general is the possibility of collisions—that is, two or more distinct records might hash to ("collide at") the same hash address. For example, suppose the suppliers file (with suppliers S100, S200, etc.) also includes a supplier with supplier number S139. Given the hash function defined above (take the remainder after division by 13 and add one), that supplier will collide at hash address 10 with supplier S100. The hash function as it stands is thus clearly inadequate—it needs to be extended somehow to deal with the collision problem. One way to do this is to treat the computed hash address, not as the final address as such, but rather as the start point for a sequential search to find that address. Thus, to insert supplier S139 (assuming that suppliers S100‑S500 already exist), we go to hash address 10 and search forward from that position for the first free location. The new supplier is stored at hash address 12. To access that supplier subsequently, we go through a similar procedure.
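The following little Python sketch mimics the scheme just described: division/remainder hashing plus a forward search on collision. It's purely illustrative; a real implementation would hash to pages on disk, wrap around at the end of the address range, cope with a full structure, and so on, none of which this toy version attempts.

SLOTS = 13
table = [None] * (SLOTS + 1)            # hash addresses 1..13; slot 0 unused

def hash_address(s_number):
    # e.g. 'S100' -> MOD(100, 13) + 1 = 10
    return int(s_number[1:]) % SLOTS + 1

def store(s_number):
    addr = hash_address(s_number)
    while table[addr] is not None:      # collision: search forward for a free slot
        addr += 1
    table[addr] = s_number

def fetch(s_number):
    addr = hash_address(s_number)
    while table[addr] is not None:
        if table[addr] == s_number:
            return addr
        addr += 1                       # keep searching forward
    return None                         # not stored

for s in ('S100', 'S200', 'S300', 'S400', 'S500', 'S139'):
    store(s)
# S139 hashes to 10, collides with S100 there, finds 11 occupied by S400,
# and ends up at hash address 12, just as in the example above.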

2.6 Data Compression

There's one last storage issue I'd like to address briefly. We've seen that one problem with auxiliary structures such as indexes is that they increase storage space requirements. By contrast, data compression techniques are ways of reducing storage space requirements. In fact, such techniques can also, and perhaps more significantly, reduce the amount of I/O—for if the data occupies less storage, then fewer I/O operations will be needed to access it. (On the other hand, extra processing cycles might be needed to decompress the data after it has been retrieved, and that extra processing could have the effect of slowing down queries and updates, thereby negating the benefits of reducing I/O in the first place.)

In general, compression techniques are designed to exploit the fact that data values are almost never completely random but instead display some degree of predictability. As a trivial example, if a given person's name in a name-and-address file starts with the letter "R", then it's extremely likely that the next person's name will start with the letter "R" as well (assuming, of course, that the file is kept in alphabetical order by name).

One simple compression technique is thus to replace each individual data value by some representation of the difference between it and the value that immediately precedes it: differential compression. Note immediately, however, that this technique requires that the data be accessed sequentially, because to decompress any given stored value requires knowledge of the immediately preceding stored value. Differential compression thus has its main applicability in situations in which the data must be accessed sequentially anyway, as in the case of (for example) the entries in an index. Note in particular that—in the case of a clustering index specifically—the pointers can be compressed as well as the data; this is because, if the logical ordering imposed by the index is the same as (or close to) the physical ordering of the underlying stored file, then successive pointer values in the index will be quite similar to one another, and pointer compression is likely to be beneficial. In fact, indexes almost always stand to gain from the use of compression, at least for the data if not for the pointers.

To illustrate differential compression, let's forget suppliers for a moment and consider an index on, say, employee names. Suppose, realistically enough, that the index entries are grouped into blocks or pages, and suppose the first four entries on some particular page are for the following employees:

Roberton
Robertson
Robertstone
Robinson

Suppose also that the employee name field in the indexed file is 12 bytes wide, so that each of these names must be considered, in its uncompressed form, to be padded at the right with an appropriate number of blanks. Then one way to apply differential compression to this set of values is to replace those characters at the front of each entry that are the same as those in the previous entry on the same page by a corresponding count: front compression. Here's the result:

0 - Roberton++++
6 - son+++
7 - tone+
3 - inson++++

(trailing blanks now shown explicitly as "+"). If the counts occupy one byte each, the space requirements for these four values have been reduced from 48 to 36 bytes—a 25 percent reduction.

Another possible compression technique for this set of data is simply to eliminate the trailing blanks entirely (again, replacing them by an appropriate count): rear compression. Result:

0 - 8 - Roberton
6 - 3 - son
7 - 4 - tone
3 - 5 - inson

The first of the two counts in each entry here is as in the previous example, the second is a count of the number of characters recorded (not a count of the number of omitted blanks, observe). The space requirement is now just 28 bytes, a reduction of nearly 42 percent compared to the original.

Further rear compression can be achieved by dropping all characters to the right of the one required to distinguish the entry from its immediate neighbor(s), thus:

0 - 7 - Roberto
6 - 2 - so
7 - 1 - t
3 - 1 - i

(I'm assuming the first four characters of the next entry when decompressed aren't "Robi"). The space requirement is now just 19 bytes, a reduction of over 60 percent. Note, however, that now we've actually lost some information—that is, when decompressed, the four entries look like this:

Roberto?????
Robertso????
Robertst????
Robi????????

(where "?" represents an unknown character). Such loss of information is obviously permissible only if the data is recorded in full somewhere (in the example, it's recorded in the corresponding indexed file).
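If it helps to see the bookkeeping spelled out, here's a rough Python sketch of front compression as just described (the rear-compression variants differ only in what's kept after the prefix count). The function and parameter names are mine, and the code is just an illustration of the arithmetic above, not production code.

def front_compress(names, width=12):
    # Each entry becomes (prefix count, remaining characters), where the
    # count says how many leading characters repeat the previous entry.
    entries, previous = [], ''
    for name in names:
        padded = name.ljust(width)      # pad to the fixed field width
        prefix = 0
        while prefix < len(previous) and previous[prefix] == padded[prefix]:
            prefix += 1
        entries.append((prefix, padded[prefix:]))
        previous = padded
    return entries

def front_decompress(entries):
    # Decompression needs the entries in sequence, as noted in the text.
    names, previous = [], ''
    for prefix, rest in entries:
        previous = previous[:prefix] + rest
        names.append(previous)
    return names

entries = front_compress(['Roberton', 'Robertson', 'Robertstone', 'Robinson'])
# -> (0, 'Roberton++++'), (6, 'son+++'), (7, 'tone+'), (3, 'inson++++'),
#    with blanks shown here as '+': four one-byte counts plus
#    12 + 6 + 5 + 9 data bytes = 36 bytes in all, as computed above.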

2.7 Concluding Remarks

So much for our quick survey of the direct-image approach to implementation, and in particular of some of the problems it seems to drag along in its wake. Surely we can do better than this. Indeed, relational advocates have claimed for years that the relational model doesn't have to be implemented this way; that's the whole point—or, at least, a large part of the point—of that clean "model vs. implementation" distinction I was emphasizing so much in Chapter 1. In fact, Codd himself, in his famous 1970 paper on the relational model [6], had this to say:

[Although] implementation problems are [not] discussed ... the material presented should be adequate for experienced systems programmers to visualize several approaches.
—E. F. Codd

The sad fact is, however, that the mainstream SQL vendors, at least, didn't do a very good job of that visualization. As a result, the idea that the model and implementation levels are supposed to be distinct, although a fundamental feature of relational products in theory, has very largely been overlooked in practice.

Incidentally, please note that I don't mean to imply by the foregoing remarks that nobody "did a good job of that visualization." Certain early prototypes and research proposals did depart—somewhat, though nothing like as far as the TR model does—from the direct-image style I've been describing (but the prototypes and proposals in question unfortunately had little or no commercial impact). I also don't mean to imply that I think all of the mainstream SQL products are implemented in exactly the same way; clearly, each of those products has its own internal architecture, unique to that product alone. But I think it's fair to say that, at least to a first approximation, the products do all store the data in the same kind of direct-image way. As a consequence, they all suffer from the kinds of deficiencies this chapter has been concerned with.

Data Independence

I'd like to close with a few more remarks on this business of clearly distinguishing between the model and its implementation. As you've probably realized, what we're really talking about here is the concept usually known as data independence: physical data independence, to be precise.10 The basic idea behind physical data independence is that we should be able to make changes (for performance reasons in particular) to the way the data is physically stored and accessed, without having to make any corresponding changes in what that data looks like to the user, and hence without having to make any corresponding changes in the source code of applications that use that data. And while it's true that data independence isn't an absolute—different systems provide it to different degrees, and few if any provide none at all—it's also true that systems that adopt a direct-image style of implementation provide far less data independence than relational systems are theoretically capable of. We need to fix this problem.

Endnotes

1. Throughout this book, I choose to overlook the fact that "#" is not included in the SQL standard character set and thus cannot be used in what the standard calls a "regular identifier" [39].
2. It might be thought of as an abstraction of a table, with tuples instead of rows and attributes instead of columns.
3. Strictly speaking, this sentence is incorrect, because "tuples" in SQL have a left-to-right ordering to their components and so aren't true tuples, and the result of the SQL query is thus not a true relation. In what follows, I'll usually ignore this point.
4. As Keith Devlin says in his book The Math Gene [46]: "The human mind is a pattern recognizer ... The ability to see patterns and similarities is one of [its] greatest strengths" (page 60).
5. I pointed out in Chapter 1 that this assumption isn't always valid (the correspondence between relations and stored files isn't always one-to-one), but this fact doesn't materially affect the arguments of the present chapter. I make the assumption merely for reasons of simplicity and definiteness.
6. Actually, bitmap indexing is not much like conventional indexing at the detail level. A detailed explanation would be out of place here; suffice it to say that bitmap indexing is nonetheless still indexing, albeit of a somewhat novel kind, and it suffers from the same kinds of problems as conventional indexing does. If you're interested and want to learn more about it, you can find informal explanations in references [49] and [50].
7. Pointer chains were used very heavily in preSQL systems, especially in CODASYL systems such as IDMS [14,25]; few of today's SQL systems use them. I should add, however, that those early systems didn't just use pointer chains, they typically exposed them to the user (with significant negative consequences), precisely because—as noted in Chapter 1—those systems failed to make a clear distinction between model and implementation. I'll have more to say on this topic in Chapter 5 (Section 5.2).
8. This is probably why pointer chains aren't much used in today's SQL systems.
9. Of course, we can always impose any logical sequence we want on a given hash structure by means of a suitable index—indeed, we could impose several such sequences, by means of several indexes. Or we could use pointer chains. But these possibilities all introduce further complications, as we already know.
10. For a discussion of the distinction between physical and logical data independence, see reference [32]. Throughout the remainder of this book, I'll take the unqualified term data independence to mean physical data independence specifically.

3 Three Levels of Abstraction

3.1 Introduction

In order to understand the TR approach to implementing the relational model, it's necessary to be very clear over three distinct levels of the system, which I'll refer to as the three levels of abstraction (since each level is an abstraction of the one below, loosely speaking). The three levels, or layers, are:

1. The relational (or user) level
2. The file level
3. The TR level

They're illustrated in Fig. 3.1. In a nutshell:

Fig. 3.1: The three levels of abstraction

• Level 1, which corresponds to the database as seen by the user, is the relational level. At this level, the data is perceived as relations, including, perhaps, the suppliers relation S discussed in Section 2.1 (and illustrated in Fig. 2.1) in the previous chapter.

• Level 3 is the fundamental TR implementation level. At this level, data is represented by means of a variety of internal structures called tables. Please note immediately that those TR tables are NOT tables in the SQL sense and do NOT correspond directly to relations at the user level.

• Level 2 is a level of indirection between the other two. Relations at the user or relational level are mapped to files at this level, and those files are then mapped to tables at the TR level. Of course, the mappings go both ways; that is, tables at the TR level map to files at the next level up, and those files then map to relations at the top level. Note: As I'm sure you know, map is a synonym for transform (and I'll be using the term in that sense throughout this book); thus, we're already beginning to touch on the TR transforms that were mentioned in Chapter 1. However, there's a great deal more to it, as we'll soon see.

Please now observe that each level has its own terminology: relational terms at the user level, file terms at the file level, and table terms at the TR level. Using different terms should, I hope, help you keep the three levels distinct and separate in your mind; for that reason, I plan to use the three sets of terms consistently and systematically throughout the rest of this book. Having said that, I now need to say too that I'm well aware that some readers might object to my choice of terms—perhaps even find them confusing—for at least the following two reasons:

• First, the industry typically uses the terminology of tables, not relations, at the user level—almost exclusively so, in fact. But I've already explained some of my rationale for wanting to use relational terms at that level (see the previous chapter, Section 2.1), and I'm going to give some additional reasons in the next section.

• Second, the industry also typically tends to think of files as a fairly "physical" construct. In fact, I did the same thing myself in the previous chapter, somewhat, though I was careful in that chapter always to be quite clear that the files I was talking about were indeed physically stored files specifically. By contrast, the files I'll be talking about in the rest of the book are not physically stored; instead, they're an abstraction of what's physically stored, and hence a "logical" construct, not a physical one. (Though it wouldn't be wrong to think of them as "slightly more physical" than the user-level relations, if you like.)

If you still think my terms are confusing, then I'm sorry, but for better or worse they're the terms I'm going to use.

One final point: When I talk of three levels, or layers, of abstraction, I don't mean that each of those levels is physically materialized in any concrete sense—of course not. The relational level is only a way of looking at the file level, a way in which certain details are ignored (that's what "level of abstraction" means). Likewise, the file level in turn is only a way of looking at the TR level. Come to that, the TR level in turn is only a way of looking at the bits and bytes that are physically stored; that is, the TR level is itself—as already noted in Chapter 1, Section 1.2—still somewhat abstract. In a sense, the bits-and-bytes level is the only level that's physically materialized.1

3.2 The Relational Level

Since the focus of this book is on the use of TR technology to implement the relational model specifically, the topmost (user) level is relational by definition. In other words, the user sees the database as a set of relations, made up of attributes and tuples as explained in Chapter 2. For simplicity, I'm going to assume those relations are all base relations specifically (again, see Chapter 2); that is, I'll simply assume, barring explicit statements to the contrary, that any relation that's named and is included in the database is in fact a base relation specifically, and I won't usually bother to use the "base" qualifier. Also, of course, the user at the relational level has available a set of relational operators—restrict, project, join, and so forth—for querying the relations in the database, as well as the usual INSERT, DELETE, and UPDATE operators for updating them.

Note: If I wanted to be more precise here, I'd have to get into the important distinction between relation values and relation variables. Relational operators like join operate on relation values, while update operators like INSERT operate on relation variables. Informally, however, it's usual to call them all just relations, and—somewhat against my better judgment—I've decided to follow that common usage (for the most part) in the present book. For further discussion of such matters, see either reference [32] or reference [40].

Now, given the current state of the IT industry, the user level in a real database system will almost certainly be based on SQL, not on the relational model. As a consequence, users will typically tend to think, not in terms of relational concepts as such, but rather in terms of SQL analogs of those concepts. For example, there isn't any explicit project operator, as such, in SQL; instead, such an operation has to be formulated in terms of SQL's SELECT and FROM operators, and the user has to think in terms of those SQL operators, as in this example ("Project suppliers over supplier number and city name"):

SELECT S.S#, S.CITY
FROM S ;


Precisely because most of today's database systems are in fact SQL systems specifically, I'll show most of my examples in what follows in SQL, not in pure relational form. But I do still want to use the terms relation, tuple, and attribute at the user level (sometimes user relations, tuples, and attributes, for emphasis), instead of the more familiar SQL terms table, row, and column, and—as I promised I would, both in Chapter 2 and in the previous section—I'd like to give my reasons for adopting this perhaps rather purist or academic position. In essence, it seems to me that to use the SQL terms would lead to at least three problems:

• First of all, we're going to need to use the terminology of tables, rows, and columns—as we very often do when discussing software internals—at the implementation level (which is to say the TR level), and the TR and SQL constructs are, as already noted, completely different things. So there would be an obvious potential for confusion right away.

• Second, the SQL terminology tends to obscure the crucial distinction alluded to above between relation values and relation variables. (SQL doesn't clearly distinguish between these concepts at all, referring to them both simply as tables, a state of affairs that has demonstrably led to some confusion in the past.)

• Third, considered as possible user-level terms, table, row, and column are in fact actively misleading (indeed, I wish we'd never used them, not even in SQL), for at least the following reasons:

  • They lend weight to the "duplicate tuples" heresy. In fact, an SQL table can have duplicate rows, although as we know a relation can't have duplicate tuples. Note: The TR model itself doesn't care whether there are duplicates or not, and hence can support SQL's nonrelational tables as well as proper relations—but I don't propose to discuss that nonrelational support in any detail in this book.

  • They suggest there's a top-to-bottom ordering to the rows, though in fact there isn't.

  • They suggest there's a left-to-right ordering to the columns. (In fact there is, in SQL—another departure from the relational model, as noted in Chapter 1.)

  • They suggest that "row-and-column intersections" in those tables can be accessed via [i,j]-style subscripting, instead of associatively. That is, tables—but definitely not relations—are often thought of as being something like arrays (two-dimensional arrays, to be precise). Note: The term "associatively" refers to the fact that data at the relational level is accessed by value, not by address. For example, "Get tuples for suppliers in London" is a relational request, but "Get the first and fourth supplier tuples" isn't. Likewise, "Get status values from tuples for suppliers in Paris" is a relational request, but "Get values of the third attribute from tuples for suppliers in Paris" isn't.

  • Most significantly, they tend to obscure the important connections between the relational model and mathematics and logic. (Those connections are important because they're what make it possible to treat database management as a science; without them, the field becomes a mere ragbag of ad hoc tricks, techniques, and rules of thumb.)

This isn't an exhaustive list.2


3.3 The File Level

The first step, conceptually speaking, in mapping a given relation to an appropriate TR representation is to convert that relation into a file, with records corresponding to the tuples and fields corresponding to the attributes. For example, Fig. 3.2 shows a possible file corresponding—in a trivially obvious way—to the suppliers relation of Fig. 2.1 in Chapter 2.

Fig. 3.2: A file corresponding to the suppliers relation of Fig. 2.1

Within such a file, records do have a top-to-bottom ordering and fields do have a left-to-right ordering, as the record numbers and field numbers in the figure are meant to suggest.3 However, the orderings in question are essentially arbitrary; thus, for example, the suppliers relation of Fig. 2.1 could map equally well to any of 2,880 different files (120 different orderings for the five records and 24 different orderings for the four fields). By way of illustration, Fig. 3.3 shows another possible file corresponding to the suppliers relation of Fig. 2.1.


Fig. 3.3: Another file corresponding to the suppliers relation of Fig. 2.1

Of course, those 2,880 different files are all equivalent to one another, in the sense that they all represent exactly the same information; in other words, they’re all information-equivalent. It’s sometimes convenient, therefore, to regard them, not so much as 2,880 distinct files as such, but rather as 2,880 different versions of “the same” file.4 This perception will turn out to be important in the next chapter—also, especially, in Chapter 7. Files, records, and fields (sometimes user files, records, and fields for emphasis, since in many respects the file level is still quite close to the user or relational level) can be operated upon by obvious counterparts to the operators available at the relational level. Also, reconstructing the corresponding relation from a given file (any version) is trivial: Just ignore the orderings. Files such as those shown in Figs. 3.2 and 3.3 can now be represented by tables at the TR level and can be reconstructed from those TR tables. In fact (important!), many different versions of the same file can all be reconstructed from the same TR tables equally easily (using the term “versions” in the special sense explained above—that is, record and field orderings might be different, but content remains the same). We’ll see how this works out in the next chapter. One last point: I’ve called this level the file level and the next more specifically the TR level because most of the ingenuity, inventiveness, and novelty of the TR model is to be found at that next level. However, the file level too is part of the overall TR implementation approach, of course.

3.4 The TR Level

Files at the file level map to tables at the TR level, and those tables are made up of rows and columns. Like records and fields within files, rows in a TR table do have a top-to-bottom ordering and columns in such a table do have a left-to-right ordering. And, very importantly, a row-and-column intersection within such a table, which I'll refer to as a cell, can be addressed via [i,j]-style subscripting (where i is the row number and j is the column number); in other words, TR tables, unlike SQL tables, can legitimately, and usefully, be thought of as two-dimensional arrays. Cells in such a table or array contain values. What's more, those values can sometimes be composite; for example, a given cell might contain an ordered pair of pointer values, and an ordered pair of values can certainly be regarded as a value—a composite value—in its own right.


Now, the mapping of files to TR tables is quite a complex business, and I don’t want to start getting into details of how it’s done until the next chapter. Suffice it to say that it’s nothing like the direct-image kind of mapping discussed in Chapters 1 and 2. In particular, rows in TR tables do not correspond in any one-to-one kind of way to records at the file level, nor a fortiori do they correspond in any one-to-one kind of way to tuples at the relational level. By way of illustration, Fig. 3.4 shows a TR table, the Field Values Table, corresponding to the file of Fig. 3.2. As I’ve already indicated, I don’t want to get into details yet of just how that table is obtained from that file, but you might like to try to figure it out for yourself (it’s not very difficult). All I want to do now is draw your attention to the fact that indeed, as claimed, the rows don’t correspond in any obvious way to the records shown in Fig. 3.2.

Fig. 3.4: Field Values Table corresponding to the file of Fig. 3.2

In order to be able to reconstruct the file of Fig. 3.2 from the Field Values Table of Fig. 3.4, we need another table, the Record Reconstruction Table.5 Again, I don’t want to get into details yet of how the Record Reconstruction Table is obtained, nor how it’s used in the reconstruction process; I’ll just show, in Fig. 3.5, a possible Record Reconstruction Table corresponding to the file shown in Fig. 3.2 (and to the Field Values Table shown in Fig. 3.4)—and point out that the entries in the Record Reconstruction Table aren’t supplier numbers or status values, etc., any longer (despite the column labels) but are row numbers instead. For further explanation, see the next chapter.

Fig. 3.5: Record Reconstruction Table corresponding to file of Fig. 3.2 (and Field Values Table of Fig. 3.4)


By the way, it’s a little misleading to talk (as I’ve just been doing) in terms of the Field Values Table and the Record Reconstruction Table, because there’ll probably be many such tables in any real implementation—one of each for each file at the file level, loosely speaking (but see Chapters 9 and 11‑14 later). However, it’s much easier to talk in terms of, for example, “the Field Values Table” instead of having to say something like “the particular Field Values Table that corresponds to the file of Fig. 3.2” every time we need to refer to such a thing. So I’ll continue to talk this way for most of the rest of this book, and hope you won’t find the practice confusing. I’ll be discussing what’s involved in building and using the Field Values Table and the Record Reconstruction Table in the next few chapters. For now, let me close by stressing a point I’ve made a couple of times already: namely, that the TR level, though obviously at a much lower level of detail than the relational level, is nevertheless still abstract. In fact, TR is a model in the sense of Chapter 1, meaning it can be regarded as a layer of abstraction over something deeper down. In particular, the TR tables discussed above, and their associated operators, can be physically implemented in a variety of different ways, some of which I’ll be talking about in later chapters. Very importantly, of course, they can be implemented in either main memory or secondary storage; indeed, they can be implemented on absolutely any hardware platform whatsoever, from a handheld or palmtop computer, to a laptop or desktop machine, to a mainframe, to a client/server or other distributed system, to the most massively parallel supercomputer. Now, this book is primarily concerned with the TR model as such, not so much with specific implementations of that model; however, Part III does specifically address the question of a disk-based implementation, since there are clearly special issues to be addressed in such an environment. By contrast, Part II doesn’t assume any particular implementation environment at all (at least, not explicitly); however, you can think of it for the most part as implicitly assuming a main-memory environment, if you find it helpful to do so.

Endnotes

1. Well ... to be pedantic about it, those bits and bytes are an abstraction too, of course, and so on, all the way down to the level of electrons (and beyond!). But bits and bytes are physical enough for our purposes.

2. In particular, I'd like to point out that certain very important relational "tables"—namely, the ones that references [12] and [24] call TABLE_DEE and TABLE_DUM—don't have any "columns" anyway. (The analogy between relations and tables breaks down here.) Further discussion of this particular issue would take us much too far afield, however; if you're intrigued and want to know more, see either reference [32] or reference [40].

3. In practice, like the record numbers discussed in Chapter 2, those record and field numbers probably won't be simple sequential numbers as shown in the figure. The same is true for row and column numbers at the TR level (see the next section).

4. Incidentally, note that those different versions can't all be obtained by means of a simple ORDER BY.

5. I could logically have called this table the File Reconstruction Table, but I wanted to emphasize the point that it can be used to reconstruct individual records of the file as well as the file in its entirety. In fact, it can be used to reconstruct any subset of the records in that file, and any subset of the fields in those records, as we'll see in the course of the next few chapters.


Part II: The Transrelational Model


4 Core Concepts

4.1 Introduction

Now (at last) I can begin to explain the TR model in detail. As I mentioned several times in Part I, TR is indeed still a model, and thus, like the relational model, still somewhat abstract. At the same time, however, it's at a much lower level of abstraction than the relational model; it can be thought of as being closer to the physical implementation level ("closer to the metal"), and accordingly more oriented toward issues of performance. In particular, it relies heavily on the use of pointers—a concept deliberately excluded from the relational model, of course, for reasons discussed in references [9], [30], [40], and many other places—and its operators are much more procedural in nature than those of the relational model. (What I mean by this latter remark is that code that makes use of those operators is much more procedural than relational code is, or is supposed to be.) What's more, reference [63] includes detailed, albeit still somewhat abstract, algorithms for implementing those operators. Note: These remarks aren't meant to be taken as criticisms, of course; I'm just trying to capture the essence of the TR model by highlighting some of its key features.

Despite its comparatively low-level nature, the fact remains that, to say it again, TR is indeed a model, and thus capable of many different physical realizations. In what follows, I'll talk for much of the time in terms of just one possible realization—it's easier on the reader to be concrete and definite—but I'll also mention some alternative implementation schemes on occasion. Note that the alternatives in question have to do with the implementation of both data structures and corresponding access algorithms. In particular, bear in mind that both main-memory and secondary-storage implementations are possible.

Now, this book is meant to be a tutorial; accordingly, I want to focus on showing the TR model in action (as it were)—that is, showing how it works in terms of concrete examples—rather than on describing the abstract model as such. Also, many TR features are optional, in the sense that they might or might not be present in any given implementation or application of the model, and it's certainly not worth getting into all of those optional features in a book of this kind. Nor for the most part is it worth getting into the optionality or otherwise of those features that are discussed—though I should perhaps at least point out that options do imply a need for decisions: Given some particular option X, some agency, at some time, has to decide whether or not X should be exercised. For obvious reasons, I don't want to get into a lot of detail on this issue here, either. Suffice it to say that I don't think many of those decisions, if any at all, should have to be made at database design time (by some human being) or at run time (by the system itself); in fact, I would expect most of them to be made during the process of designing the DBMS that is the specific TR implementation in question. In other words, I don't think the fact that those decisions do have to be made implies that a TR implementation will therefore suffer from the same kinds of problems that arise in connection with direct-image systems, as discussed in Chapter 2.

It follows from all of the above that this book is meant as an introduction only; many topics are omitted and others are simplified, and I make no claims of completeness of any kind.


Now let's get down to business. In this chapter and the next,1 we'll be looking at what are clearly the most basic TR constructs of all: namely, the Field Values Table and the Record Reconstruction Table, both of which were mentioned briefly in the final section of the previous chapter. These two constructs are absolutely fundamental—everything else builds on them, and I recommend as strongly as I can that you familiarize yourself with their names and basic purpose before you read much further. Just to remind you:

• The Field Values Table contains the field values from a given file, rearranged in a way to be explained in Section 4.3.

• The Record Reconstruction Table contains information that allows records of the given file to be reconstructed from the Field Values Table, in a way to be explained in Section 4.4.

In subsequent chapters I'll consider various possible refinements of those core concepts. Note: Those refinements might be regarded in some respects as "optional extras" or "frills," but some of them are very important—so much so, that they'll almost certainly be included in any concrete realization of the TR model, as we'll see.

4.2 The Crucial Idea

Let r be some given record within some given file at the file level. Then the crucial insight underlying the TR model can be characterized as follows: The stored form of r involves two logically distinct pieces, a set of field values and a set of “linkage” information that ties those field values together, and there’s a wide range of possibilities for physically storing each piece. In direct-image systems, the two pieces (the field values and the linkage information) are kept together, of course; in other words, the linkage information in such systems is represented by physical contiguity. In TR, by contrast, the two pieces are kept separate; to be specific, the field values are kept in the Field Values Table, and the linkage information is kept in the Record Reconstruction Table. That separation makes TR strikingly different from virtually all previous approaches to implementing the relational model (see Chapters 1 and 2), and is the fundamental source of the numerous benefits that TR technology is capable of providing. In particular, it means that TR data representations are categorically not a direct image of what the user sees at the relational level. Note: One immediate advantage of the separation is that the Field Values Table and the Record Reconstruction Table can both be physically stored in a way that is highly efficient in terms of storage space and access time requirements. However, we’ll see many additional advantages as well, both in this chapter and in subsequent ones.

4.3 The Field Values Table

Consider the file shown in Fig. 4.1. The figure is basically a repeat of Fig. 3.2, except that for the sake of the example I’ve rearranged the records into a different top-to-bottom sequence (after all, we know from Chapter 3 that record sequence at the file level is effectively arbitrary anyway; in fact, the same is true of left-to-right field sequence as well, but for simplicity I’ve kept that unchanged). Fig. 4.2, a repeat of Fig. 3.4, shows the corresponding Field Values Table.


Fig. 4.1: A file corresponding to the suppliers relation of Fig. 2.1

Fig. 4.2: Field Values Table corresponding to the file of Fig. 4.1


Note: Together with Fig. 2.1, which shows the original suppliers relation, Figs. 4.1 and 4.2 form the basis for a running example that I'll be using throughout this chapter (and indeed throughout the next two chapters as well). You might want to keep a copy of those figures by you for ease of subsequent reference.

Now, you've probably figured out for yourself how the Field Values Table is obtained from the corresponding file: Basically, each column of the table contains the values from the corresponding field of the file, rearranged into ascending sort order. (A small concrete sketch of this construction, in code, appears at the end of this section.) Note immediately, therefore, that no matter what order the records of the file appear in initially, we wind up with the same Field Values Table; that's why Figs. 4.2 and 3.4 are identical, even though Figs. 4.1 and 3.2 are not. In other words, record ordering is irrelevant so far as the Field Values Table is concerned. (By contrast, field ordering is not irrelevant; that is, the left-to-right column ordering of the Field Values Table is the same as the left-to-right field ordering in the corresponding file. However, this point isn't very important so far as the user is concerned.)

Incidentally, it should be immediately clear from the example that one way to think about TR is that it's a technology that stores the data "attribute-wise" rather than "tuple-wise"—though I hasten to add that this informal characterization doesn't even begin to capture all of the implications and advantages of the TR approach. Now, although by contrast most mainstream SQL products store the data "tuple-wise" (as we saw in Chapter 2), there have been a few systems, both prototypes and commercial products, that have stored the data "attribute-wise" instead (see, for example, references [2], [49], [52], [65], and [66]); indeed, some of those products are still available in the marketplace at the time of writing. But none of those systems carried (or carry) the "attribute-wise" idea to anything like the same lengths that TR does. Note: By the same token, some of those systems used or use various kinds of data compression on the attributes, too, but again not nearly to the same extent that TR does (see Chapters 8 and 9).

It should also be clear from the example that TR takes the concept of data independence much further than previous systems have done. To be specific, there's essentially no concept of a user-level tuple at all at the TR level, whereas (again as we saw in Chapter 2) conventional systems typically do store direct images of user-level tuples, albeit in a variety of different ways. (Even those systems that store data "attribute-wise" still retain fairly close ties between the user level and the physical storage level—for example, by ensuring that the attribute values from a given user-level tuple all appear at the same relative position within the individual attribute representations.)

Anyway, let's get back to the Field Values Table. I'm clearly not in a position yet to describe exactly how that table is used, nor to explain its advantages (I need to discuss the Record Reconstruction Table first); nevertheless, I'd still like to mention a few points that I think should at least make some intuitive sense, even before we start to look at the Record Reconstruction Table as such.

• First of all, the fact that each column of the Field Values Table is in sorted order is clearly going to help with user-level ORDER BY requests. For example, a request to see suppliers in city name sequence shouldn't require a run-time sort, nor an index.
• The same is true of a request to see suppliers in reverse city name sequence (meaning descending sort order, instead of ascending)—the implementation can simply process the Field Values Table bottom to top instead of top to bottom.


• Analogous remarks apply to every single attribute; that is, the Field Values Table effectively represents several different sort orders simultaneously (in effect, a sort order in both directions on every individual attribute).

• Requests involving specific value lookups—for example, a request to see suppliers in London—can be implemented by means of a binary search. And, again, analogous remarks apply to every attribute. Note: Binary search is also known as logarithmic search, on account of the fact that it's an O(log N) algorithm, where N is the number of items in the list to be searched and O(log N) means the execution time is proportional to log N (O here stands for "order of magnitude"). Sequential search, by contrast, is an O(N) algorithm. For example, if N = 1,000,000, then we might say, loosely, that binary search is some 50,000 times more efficient than sequential search.

These points will all be expanded and made clearer in Sections 4.4, 4.5, and 4.6 below. For now, here are a couple of final remarks to close out this section:

• In some respects, the Field Values Table can be thought of as a kind of bridge between the user perception of the data (meaning the original user-level relation and/or the corresponding file) and other internal TR structures. Note in particular that the Field Values Table is the only TR table that contains user data as such—all of the others contain internal information, encoded in ways that make sense to TR but aren't directly relevant to, or exposed to, the user at all.

• As I explained in Chapter 2, at the end of Section 2.2, there's only one physical sequence available to us at the hardware level, so we want to make the best use of it we can. In the TR approach, we store the Field Values Table in physical sequence by row number. (It should be clear from what I said a few paragraphs back—regarding, for example, ORDER BY requests—that we often need to process the Field Values Table sequentially by row number, so storing it as just indicated is clearly advantageous.) Of course, storing the Field Values Table in physical sequence in this manner doesn't preclude us from exploiting physical sequence appropriately for other internal structures as well, but it's vitally important that we do so in the case of the Field Values Table in particular.
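Here, then, is the small sketch promised above. It's purely illustrative: a toy rendering in Python of the ideas just described, using the suppliers file of Fig. 4.1, and it isn't meant to suggest that a real TR implementation represents its structures this way (the data layout and helper names are mine, not part of the TR model). It simply builds the Field Values Table by sorting each field independently, and then uses a binary search (Python's bisect module) to find the rows of a column containing a given value.

from bisect import bisect_left, bisect_right

# The suppliers file of Fig. 4.1 (records 1-5; fields S#, SNAME, STATUS, CITY).
suppliers_file = [
    ("S4", "Clark", 20, "London"),
    ("S5", "Adams", 30, "Athens"),
    ("S2", "Jones", 10, "Paris"),
    ("S1", "Smith", 20, "London"),
    ("S3", "Blake", 30, "Paris"),
]

def build_field_values_table(records):
    """Each column of the Field Values Table is the corresponding field of the
    file, rearranged into ascending sort order (record order is irrelevant)."""
    n_fields = len(records[0])
    return [sorted(rec[j] for rec in records) for j in range(n_fields)]

fvt = build_field_values_table(suppliers_file)   # fvt[j-1] is column j, sorted

def find_rows(fvt, col, value):
    """Binary search, O(log N), for the rows of column `col` containing `value`;
    returns 1-based row numbers, matching the [i,j] convention used in the text."""
    column = fvt[col - 1]
    lo, hi = bisect_left(column, value), bisect_right(column, value)
    return list(range(lo + 1, hi + 1))

print(fvt[3])                       # ['Athens', 'London', 'London', 'Paris', 'Paris']
print(find_rows(fvt, 4, "London"))  # [2, 3]  (i.e., cells [2,4] and [3,4])

Note that each column comes out in ascending order automatically, so the ORDER BY points above fall out for free: an ascending scan reads a column top to bottom, a descending scan reads it bottom to top.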

4.4 The Record Reconstruction Table

Fig. 4.3 shows the Field Values Table from Fig. 4.2 side by side with an appropriate Record Reconstruction Table. Note that the two tables both have the same number of rows and columns; indeed, there’s a direct one-to-one correspondence between the cells of the two tables, as we’ll see in a moment. (In fact, each table has the same number of rows and columns as the file in Fig. 4.1 has records and fields, respectively.) Note too that the entries in the Record Reconstruction Table cells aren’t supplier numbers or supplier names (etc.) any longer; instead, they’re row numbers, and those row numbers can be thought of as pointers to the rows of either or both of the Field Values Table and the Record Reconstruction Table, depending on the context in which they’re used. (For this reason, the columns in the Record Reconstruction Table really ought not to be labeled S#, SNAME, etc., as I’ve shown them in the figure; however, I think those labels help to make certain later explanations easier to follow.) Note: You might want to keep a copy of the Record Reconstruction Table from Fig. 4.3 by you as well for purposes of subsequent reference.


Fig. 4.3: Field Values Table of Fig. 4.2 and a corresponding Record Reconstruction Table

Now, I deliberately don't want to get into details just yet as to how the Record Reconstruction Table is built in the first place; instead, I want to show how it's used. To that end, please consider the following sequence of operations. (Recall from Chapter 3 that, in the subscript expression [i,j], i is a row number and j is a column number.)

Step 1: Go to cell [1,1] of the Field Values Table and fetch the value stored there—namely, the supplier number S1. That value is the first field value (that is, the S# field value) within a certain supplier record in the suppliers file.

Step 2: Go to the same cell (that is, cell [1,1]) of the Record Reconstruction Table and fetch the value stored there—namely, the row number 5. That row number is interpreted to mean that the next field value (which is to say, the second or SNAME value) within the supplier record whose S# field value is S1 is to be found in the SNAME position of the fifth row of the Field Values Table—in other words, in cell [5,2] of the Field Values Table. Go to that cell and fetch the value stored there (supplier name Smith).


Step 3: Go to the corresponding Record Reconstruction Table cell [5,2] and fetch the row number stored there (3). The next (third or STATUS) field value within the supplier record we're reconstructing is in the STATUS position in the third row of the Field Values Table—in other words, in cell [3,3]. Go to that cell and fetch the value stored there (status 20).

Step 4: Go to the corresponding Record Reconstruction Table cell [3,3] and fetch the value stored there (which is 3 again). The next (fourth or CITY) field value within the supplier record we're reconstructing is in the CITY position in the third row of the Field Values Table—in other words, in cell [3,4]. Go to that cell and fetch the value stored there (city name London).

Step 5: Go to the corresponding Record Reconstruction Table cell [3,4] and fetch the value stored there (1). Now, the "next" field value within the supplier record we're reconstructing looks like it ought to be the fifth such value; however, supplier records have only four fields, so that "fifth" wraps around to become the first. Thus, the "next" (first or S#) field value within the supplier record we're reconstructing is in the S# position in the first row of the Field Values Table—in other words, in cell [1,1]. But that's where we came in, and the process stops.

As I hope you can see, the foregoing sequence of operations allows us to reconstruct one particular record from the suppliers file—to be specific, the one shown as record number 4 in Fig. 4.1: namely, the record for supplier S1 (Smith, status 20, London).

(I don’t mean to suggest that the record number itself—4, in the example—is produced in the reconstruction process; I’ve shown it here merely to help you relate the output from that process back to the file as shown in Fig. 4.1.) By the way, note how the row-number pointers we followed in the foregoing example form a ring—in fact, two isomorphic rings, one in the Field Values Table and one in the Record Reconstruction Table. See Fig. 4.4.

Fig. 4.4: Pointer rings (examples)
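To make the zigzag walk just described concrete, here is a continuation of the earlier Python sketch. The Record Reconstruction Table below is the one implied by the walkthrough above (it is consistent with every cell the text mentions and should agree with Fig. 4.3, but treat it as illustrative rather than authoritative), and the reconstruct function is my own naming, not part of the TR model; it assumes the fvt list built in the previous sketch.

# Record Reconstruction Table implied by the walkthrough above (rows 1-5;
# columns S#, SNAME, STATUS, CITY); each cell holds a row number.
rrt = [
    [5, 4, 4, 5],
    [4, 5, 2, 4],
    [2, 2, 3, 1],
    [3, 1, 1, 2],
    [1, 3, 5, 3],
]

def reconstruct(fvt, rrt, row, col):
    """Follow the zigzag (pointer ring) passing through cell [row, col],
    picking up one field value per column; the result is returned with its
    fields rotated back into S#, SNAME, STATUS, CITY order."""
    n_cols = len(rrt[0])
    fields = {}
    i, j = row, col
    for _ in range(n_cols):
        fields[j] = fvt[j - 1][i - 1]   # value in cell [i,j] of the Field Values Table
        i = rrt[i - 1][j - 1]           # cell [i,j] of the RRT gives the next row number
        j = j % n_cols + 1              # move to the next column, wrapping around
    return tuple(fields[j] for j in range(1, n_cols + 1))

print(reconstruct(fvt, rrt, 1, 1))  # ('S1', 'Smith', 20, 'London')  -- record 4 of Fig. 4.1
print(reconstruct(fvt, rrt, 2, 1))  # ('S2', 'Jones', 10, 'Paris')   -- record 3 of Fig. 4.1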


As an exercise (Exercise 1; see endnote 2), I strongly recommend you try reconstructing another supplier record for yourself. If you start with cell [2,1] in the Field Values Table, you should obtain record number 3 from Fig. 4.1 (that is, the record for supplier S2).

Similarly, starting with cell [3,1] gives record 5; starting with cell [4,1] gives record 1; and starting with cell [5,1] gives record 2. Observe the net effect: If we process the entire Field Values Table in supplier number order by going top to bottom down the S# column—that is, if we carry out the record reconstruction process five times, starting respectively with cells [1,1], [2,1], [3,1], [4,1], and [5,1], in that order—then we reconstruct a version of the entire original suppliers file in which the records appear in ascending supplier number order. In other words, we've just implemented the following SQL query—

SELECT S.S#, S.SNAME, S.STATUS, S.CITY
FROM   S
ORDER  BY S# ;

Likewise, to implement this SQL query—

SELECT S.S#, S.SNAME, S.STATUS, S.CITY
FROM   S
ORDER  BY S# DESC ;

(where DESC means descending sequence)—all we have to do is process the supplier number column of the Field Values Table in reverse order and do the reconstructions starting from cell [5,1], then [4,1], and so on. What's more, we haven't had to do a run-time sort in either case, nor have we had to use an index.

Ordering by Other Attributes

Now consider this SQL query:

SELECT S.S#, S.SNAME, S.STATUS, S.CITY
FROM   S
ORDER  BY STATUS ;

Precisely because (as noted earlier) the pointers in the Record Reconstruction Table form rings, we can enter those rings at any point. When we apply the reconstruction algorithm, therefore, we can start at any cell we like. In particular, if we start with cell [1,3]—that is, the first cell in the STATUS column—we obtain the record for supplier S2 (the record containing the status value 10).


(More precisely, we obtain a version of this record in which the left-to-right field ordering is STATUS, then CITY, then S#, then SNAME.) Following on down the STATUS column—that is, starting the reconstruction process successively with cells [2,3], [3,3], [4,3], and [5,3]—we’ll eventually obtain the entire suppliers file in ascending status order. In analogous fashion, if we process the Record Reconstruction Table in sequence by entries in the SNAME column, we obtain the suppliers file in ascending supplier name order; likewise, if we process it in sequence by entries in the CITY column, we obtain the file in ascending city name order. In other words, the Record Reconstruction Table and the corresponding Field Values Table together represent all of these orderings simultaneously—without (to repeat) any need for either indexes or run-time sorting. This fact constitutes one of the major benefits of the TR approach. By the way, this is as good a point as any to mention that the reconstruction algorithm is known informally as the zigzag algorithm (and the individual pointer rings are known as zigzags), for obvious reasons.
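Continuing the same illustrative Python sketch (and still assuming the fvt, rrt, and reconstruct definitions given earlier, which are my own toy rendering rather than part of the TR model), an ORDER BY on any attribute reduces to entering the zigzags via that attribute's column, once per row, top to bottom for ascending order or bottom to top for descending order, with no run-time sort and no index:

def order_by(fvt, rrt, col, descending=False):
    """Reconstruct every record, entering the zigzags via the given column;
    the records come back ordered by that column, with no run-time sort."""
    rows = range(len(rrt), 0, -1) if descending else range(1, len(rrt) + 1)
    return [reconstruct(fvt, rrt, i, col) for i in rows]

for rec in order_by(fvt, rrt, 3):   # ORDER BY STATUS
    print(rec)
# ('S2', 'Jones', 10, 'Paris')
# ('S4', 'Clark', 20, 'London')
# ('S1', 'Smith', 20, 'London')
# ('S5', 'Adams', 30, 'Athens')
# ('S3', 'Blake', 30, 'Paris')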


And by the way again: Notice that, to be precise, we can't sensibly talk about the Record Reconstruction Table that corresponds to a given Field Values Table; rather, we have to talk in terms of the Record Reconstruction Table that corresponds to a given file (and therefore, in a sense, to the unique Field Values Table that corresponds to that file as well). The reason is that—obviously enough—several logically distinct files can all have the same Field Values Table, and such files will clearly need different Record Reconstruction Tables in order to support the corresponding reconstruction process properly. For example, this state of affairs would obtain if we had a suppliers file that was identical to the one shown in Fig. 4.1 except that supplier S1 was named Jones and supplier S2 was named Smith.

Equality Restrictions

Now let's take a look at an SQL query involving a simple equality restriction:

SELECT S.S#, S.SNAME, S.STATUS, S.CITY
FROM   S
WHERE  S.CITY = 'London' ;

Since the CITY column (like every column) of the Field Values Table is kept in sorted order, a binary search—or simple variant thereof—can be used to find the cells containing London. Given the Field Values Table of Fig. 4.2, those cells turn out to be [2,4] and [3,4]. Zigzags can now be constructed by following the pointer rings running through cells [2,4] and [3,4] of the Record Reconstruction Table. In the example, those zigzags look like this:

[2,4], [4,1], [3,2], [2,3]   and   [3,4], [1,1], [5,2], [3,3]

Superimposing these zigzags on the Field Values Table, we obtain the field values for the desired records: namely, the records for suppliers S4 and S1, both of which have city London.
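In the same illustrative Python sketch, this equality restriction is simply the binary-search helper from Section 4.3 combined with the zigzag reconstruction (again, the function names are mine and the code is a toy sketch, not a description of a real implementation):

def restrict_equal(fvt, rrt, col, value):
    """WHERE column = value: binary-search the sorted column for the matching
    cells, then follow each zigzag to rebuild the corresponding record."""
    return [reconstruct(fvt, rrt, i, col) for i in find_rows(fvt, col, value)]

print(restrict_equal(fvt, rrt, 4, "London"))
# [('S4', 'Clark', 20, 'London'), ('S1', 'Smith', 20, 'London')]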


Other User-Level Operations

It should be clear that the Field Values Table and the Record Reconstruction Table together offer direct support for many other user-level operations too, in addition to simple ORDER BY and equality restriction operations. In fact, most if not all of the fundamental relational operations—restrict, project, join, summarize, and others (not to mention the operation of duplicate elimination, which is needed internally, even in true relational systems)—have implementation algorithms that rely on the ability to access the data in some specific sequence. By way of example, consider join. We saw in Chapter 2 that sort/merge is a good way to implement join. Well, TR lets us do a sort/merge join without having to do the sort!— or, at least, without having to do the run-time sort (the sort's done when the Field Values and Record Reconstruction Tables are built, which is to say at load time, loosely speaking). Suppose, for example, that the database involves a parts relation as well as the suppliers relation, and suppose both relations have a CITY attribute. In order to join suppliers and parts over city names, then, we simply have to access each of the two Field Values Tables in city name sequence and do a merge-style join. (A small sketch of this merge step appears at the end of the section.)

One important implication of all of the above is that life becomes much easier for the system optimizer; to be more specific, the access path selection process (see Chapter 2) becomes much simpler—even completely unnecessary, in some cases. Another implication is that many of the auxiliary structures found in traditional DBMSs become unnecessary too (though it might be a good idea to use hashing on either the Field Values Table or the Record Reconstruction Table or both, if those tables get very large3). Yet another implication is that physical database design becomes much easier, involving as it does far fewer options and choices, and the same is true for performance tuning. For further discussion of the use of TR structures in implementing the relational operators, see Chapter 10.

Meanwhile, I'll close this section with a nice analogy that might help you understand and remember how the Field Values and Record Reconstruction Tables fit into the overall scheme of things:

• The Field Values Table is like a parts list that's used in some manufacturing process.

• The Record Reconstruction Table is like instructions for assembling parts—that is, instructions for using that parts list to manufacture finished products.

Incidentally, it should be clear from this analogy that the "assembly" process is bound to have some associated costs, especially in a disk-based environment, and we clearly want to keep those costs to a minimum. I'll address this issue in subsequent chapters.
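Here is the merge sketch mentioned above. Since the parts relation isn't shown in this chapter, the parts CITY column below is a made-up stand-in (a hypothetical Field Values Table column for parts, already in ascending order as it would be in TR); the point is only that two columns that are already in the same sort order can be joined in a single merge pass, with no run-time sort. Each row number in a resulting pair can then be handed to the zigzag reconstruction for its own file.

def merge_join_rows(city_col_a, city_col_b):
    """Merge two ascending CITY columns (from two Field Values Tables) and
    return the pairs of 1-based row numbers whose city values match."""
    pairs, a, b = [], 0, 0
    while a < len(city_col_a) and b < len(city_col_b):
        if city_col_a[a] < city_col_b[b]:
            a += 1
        elif city_col_a[a] > city_col_b[b]:
            b += 1
        else:
            # emit the full cross product of the two runs of equal values
            a_end = a
            while a_end < len(city_col_a) and city_col_a[a_end] == city_col_b[b]:
                a_end += 1
            b_end = b
            while b_end < len(city_col_b) and city_col_b[b_end] == city_col_a[a]:
                b_end += 1
            pairs += [(i + 1, j + 1) for i in range(a, a_end) for j in range(b, b_end)]
            a, b = a_end, b_end
    return pairs

parts_cities = ["Athens", "London", "London", "Oslo", "Paris"]   # hypothetical parts FVT column
print(merge_join_rows(fvt[3], parts_cities))
# [(1, 1), (2, 2), (2, 3), (3, 2), (3, 3), (4, 5), (5, 5)]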

4.5 Building the Record Reconstruction Table

I’ve now shown in outline what the Record Reconstruction Table looks like and how it’s used, but I haven’t shown how it’s built in the first place. Now it’s time to take a look at this latter question. Please note, however, that I’ll be revisiting this topic at several points in later chapters (as well as in the final section of the present chapter); all I want to do for the moment is consider the simple case. Once again I’ll base my discussions and explanations on the suppliers file shown in Fig. 4.1, together with the corresponding Field Values Table shown in Fig. 4.2 and repeated in Fig. 4.3.


Note first that the Record Reconstruction Table is built directly from the file (the Field Values Table plays no part in the process at all). We begin by considering the effect of applying various sort orderings to that file. For example, if we sort the file by ascending supplier number, we get the records in the sequence 4, 3, 5, 1, 2. I'll call this sequence the record permutation corresponding to the ordering "ascending S#" (the S# permutation for short). Other permutations are as follows:

• Ascending SNAME: 2, 5, 1, 3, 4

• Ascending STATUS: 3, 1, 4, 2, 5

• Ascending CITY: 2, 1, 4, 3, 5

We can summarize these permutations by means of the following Permutation Table:


Note: It follows from the way we built it that, in this table, cell [i,j] contains the record number within the suppliers file of the record that appears in the ith position when that file is sorted by ascending values of the jth field. (You might want to read that sentence again.) For example, cell [3,2] contains the value 1; if the original file is sorted by ascending SNAME value—SNAME being the second field—the record that appears in the third position is indeed record number 1 (since that record contains the third lowest SNAME value, Clark). Now, the foregoing Permutation Table is not the desired Record Reconstruction Table, but it could certainly be used to perform the function of that table (that is, it could be used to reconstruct records of the original file), as follows. Suppose we want to reconstruct the fourth record of that file. Noting that the value 4 appears in the first position in column 1, the fifth position in column 2, the third position in column 3, and the third position again in column 4, we can go to the Field Values Table and pick out the supplier number in cell [1,1], the supplier name in cell [5,2], the status value in cell [3,3], and the city name in cell [3,4], to obtain, once again, the record for supplier S1.

In other words, the sequence of Permutation Table cells [1,1], [5,2], [3,3], [3,4] indicates that record number 4 appears first in the S# permutation (“ORDER BY S#”), fifth in the SNAME permutation (“ORDER BY SNAME”), third in the STATUS permutation (“ORDER BY STATUS”), and third again in the CITY permutation (“ORDER BY CITY”). And if that sequence of Permutation Table cells seems familiar, then so it should—it’s exactly the sequence of cells we passed through (albeit in the Field Values and Record Reconstruction Tables, not the Permutation Table) when we were reconstructing record number 4 in the previous section (Section 4.4). Now, the trouble with the foregoing algorithm—the algorithm, that is, for reconstructing records from the Permutation Table—is that the record numbers are effectively stored in each column of that table in random order. As a consequence, a sequential search is needed to find the desired record number (4, in the example) in each column. However, we can overcome this difficulty by using the Record Reconstruction Table in place of the Permutation Table. The Record Reconstruction Table differs from the Permutation Table in the following important respect: Where the Permutation Table has a sequence of cells (one cell per column) that each contain some particular record number, the Record Reconstruction Table has a sequence of cells (corresponding to the record with that record number) that each contain a pointer to the next cell in that sequence.


(As we know, the pointers in question are row numbers, and those row numbers identify both rows in the Record Reconstruction Table itself and rows in the corresponding Field Values Table.) Thus, for example, considering only record number 4, the relevant cells of the Permutation Table are [1,1], [5,2], [3,3], and [3,4], each of which contains the record number 4 itself. By contrast, the corresponding cells of the Record Reconstruction Table are linked into a pointer ring or zigzag—cell [1,1] contains 5, cell [5,2] contains 3, cell [3,3] contains 3, and cell [3,4] contains 1—as indeed we already know from the previous section. And of course it's much faster to follow a ring of pointers than to do a series of sequential searches.

Incidentally, note that the zigzag just described in the Record Reconstruction Table—unlike its counterpart in the Permutation Table—includes no information as to which particular record in the suppliers file it corresponds to; all we know is that the cells linked together in that zigzag do all correspond to the same record in that file. But no information has really been lost, because the original record orderings (and hence record numberings) were arbitrary anyway. In other words, if there are M records altogether, we can in principle generate M! ("factorial M" = M * (M‑1) * (M-2) * ... * 3 * 2 * 1) different versions of the original file from the same Record Reconstruction Table. Of course, those versions are all information-equivalent, as explained in Chapter 3.

(In contrast to the foregoing paragraph, the Record Reconstruction Table does still include information regarding the left-to-right field ordering of the corresponding file, inasmuch as its left-to-right column ordering is exactly that ordering. However, this fact, although it does have some bearing on certain internal operations, is irrelevant to the user at the relational level.)


Here then is the algorithm for building the Record Reconstruction Table from the Permutation Table:

Step 1: Let PT be the Permutation Table. Build a table RRT with the same number of rows and columns as PT and with all cells empty.

Step 2: For all records in the user file, do Step 3.

Step 3: For all columns of PT, do Step 4.

Step 4: Let the current record of the user file be the rth record, and let the current column of PT be the jth column. Let cell [i,j] of PT be that cell of column j that contains the record number r. At cell [i,j] of RRT, place the value i' where cell [i',j+1] of PT is that cell of column j+1 that contains the record number r. If column j is the last column, take column j+1 as the first column.

After this algorithm has been executed, RRT is the desired Record Reconstruction Table. As an exercise (Exercise 2), you might like to check that the foregoing algorithm, when applied to the Permutation Table shown earlier in this section together with the suppliers file of Fig. 4.1, does indeed yield the Record Reconstruction Table shown in Fig. 4.3. You might also like to check—this is Exercise 3—that the Record Reconstruction Table shown in Fig. 3.5 in the previous chapter is correct for the file shown in Fig. 3.2 (and the Field Values Table shown in Fig. 3.4). By the way, did you notice that the Record Reconstruction Tables shown in Figs. 3.5 and 4.3 are different? Why do you think that is?
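For what it's worth, here is the same algorithm rendered in the running Python sketch (again a toy, with names of my own choosing; it assumes the suppliers_file list and the rrt table from the earlier sketches). The Permutation Table is built by sorting record numbers on each field; Python's sort is stable, so ties are broken by original record number, which happens to reproduce the particular permutations listed above. The Record Reconstruction Table is then derived from it exactly as in Steps 1 through 4.

def build_permutation_table(records):
    """Column j of the Permutation Table lists the 1-based record numbers of
    the file in ascending order of field j (ties broken by record number)."""
    n_fields = len(records[0])
    return [
        sorted(range(1, len(records) + 1), key=lambda r: records[r - 1][j])
        for j in range(n_fields)
    ]

def build_rrt(perm):
    """For each record r and column j, find the cell of column j holding r and
    store in it the row number of the cell of column j+1 (wrapping around)
    that also holds r -- Steps 1-4 of the algorithm above."""
    n_cols, n_rows = len(perm), len(perm[0])
    # position[j][r] = i  such that  perm[j][i-1] == r
    position = [{r: i + 1 for i, r in enumerate(col)} for col in perm]
    rrt_out = [[None] * n_cols for _ in range(n_rows)]
    for r in range(1, n_rows + 1):
        for j in range(n_cols):
            i = position[j][r]
            rrt_out[i - 1][j] = position[(j + 1) % n_cols][r]
    return rrt_out

perm = build_permutation_table(suppliers_file)
print(perm)                      # [[4, 3, 5, 1, 2], [2, 5, 1, 3, 4], [3, 1, 4, 2, 5], [2, 1, 4, 3, 5]]
print(build_rrt(perm) == rrt)    # True -- it matches the table used in the earlier sketches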


4.6 The Record Reconstruction Table is not Unique

Well, you probably answered the question at the end of the previous section easily enough: The Record Reconstruction Tables of Figs. 3.5 and 4.3 are different because the Permutation Tables from which they were built are different. And the reason the Permutation Tables are different is because they in turn were built from different versions of the original suppliers file, with different record orderings. Of course, the differences in question aren’t very important, in a sense, because every possible record ordering in the original file can in principle be reconstructed from either of the two Record Reconstruction Tables. However, there’s another reason (a more important reason) why the Record Reconstruction Table is, in general, not unique. Indeed, we can obtain different Record Reconstruction Tables even without starting from different versions of the file, as I’ll now demonstrate. First of all, consider the Permutation Table from the previous section once again:

For definiteness, let's focus on the STATUS permutation, which, if you'll glance back at the beginning of the previous section, you'll see is 3, 1, 4, 2, 5 (as indeed you can also see from column 3 of the Permutation Table itself). As you'll recall, the meaning of that permutation is that if we sort the suppliers file of Fig. 4.1 by ascending status value, the records of that file will appear in the indicated sequence 3, 1, 4, 2, 5. However, I wasn't being entirely honest with you when I discussed these ideas previously. Since records 1 and 4 (for suppliers S4 and S1, respectively) both contain the status value 20, and records 2 and 5 (for suppliers S5 and S3, respectively) both contain the status value 30, the STATUS permutation is not unique. In fact, there are four possible STATUS permutations that are all equally valid (and all equivalent, in a sense):

• 3, 1, 4, 2, 5

• 3, 4, 1, 2, 5

• 3, 1, 4, 5, 2

• 3, 4, 1, 5, 2


Analogous remarks apply to the CITY permutation, though not to the S# permutation (nor to the SNAME permutation, as it happens, in this particular example). It follows from the foregoing that the Permutation Table is not unique, and hence that the Record Reconstruction Table is not unique either. For example, here’s another valid Permutation Table corresponding to the file of Fig. 4.1:

And here’s the corresponding Record Reconstruction Table:

Let's just confirm that this Record Reconstruction Table can indeed be used to reconstruct the records of the original file of Fig. 4.1. Let's start (arbitrarily) at cell [4,1]. Then:

• Cell [4,1] of the Field Values Table contains the supplier number S4; cell [4,1] of the Record Reconstruction Table contains 3, so next we go to cell [3,2].

• Cell [3,2] of the Field Values Table contains the supplier name Clark; cell [3,2] of the Record Reconstruction Table contains 3 again, so next we go to cell [3,3].

• Cell [3,3] of the Field Values Table contains the status value 20; cell [3,3] of the Record Reconstruction Table contains 2, so next we go to cell [2,4].

• Cell [2,4] of the Field Values Table contains the city name London; cell [2,4] of the Record Reconstruction Table contains 4, and we're back where we started, having reconstructed the supplier record for supplier S4 (Clark, status 20, London).


The fact that the Record Reconstruction Table is, in general, nonunique in the foregoing sense will turn out to be very important in Chapter 7. Note, however, that although the Record Reconstruction Table is indeed nonunique as we’ve just seen, in what follows I’ll continue to talk in terms of “the” Record Reconstruction Table much of the time, just to keep things simple.

Endnotes

1. I've split the material across two chapters simply because there's such a lot of ground to cover—I didn't want you to have to deal with one great big monolithic and indigestible chapter, especially at this point in the book, and especially when the topics involved are so fundamental.

2. The reference is to Exercise 1 in Appendix A. I'll follow this numbering style for exercises throughout the rest of the book. (By the way, this is as good a place as any to remind you that Appendix A doesn't just contain the exercises as originally stated—it also includes much of the necessary background material. In the case of Exercise 1, for example, it includes a repeat of the Field Values Table and Record Reconstruction Table from Fig. 4.3 and a repeat of the pointer rings from Fig. 4.4.)

3. The hash in question would have to be indirect, however [48,60]—it couldn't be a simple "direct" hash as described in Chapter 2, because of the inherently ordered nature of both the Field Values Table and the Record Reconstruction Table. But we can have as many hashes as we like, so long as they are indirect; in the very unlikely extreme, we could even have a hash on every column of each of the two tables. Note, however, that the performance improvements that hashing might provide are likely to be small in comparison to the fundamental improvements that the TR structures offer in the first place.


5 Core Concepts (Continued)

5.1 Introduction

This chapter continues our examination of the core constructs of the TR model (principally the Field Values and Record Reconstruction Tables). However, the chapter is rather more of a potpourri than the previous one. Its structure is as follows. Following this short introductory section, Section 5.2 offers some general observations regarding performance. Section 5.3 then briefly surveys the TR operators, and Sections 5.4 and 5.5 take another look at how the Record Reconstruction Table is built and how record reconstruction is done. Sections 5.6 and 5.7 describe some alternative perspectives on certain of the TR constructs introduced in Chapter 4. Finally, Section 5.8 takes a look at some alternative ways of implementing some of the TR structures and algorithms also first described in that previous chapter.

5.2 Some Remarks on Performance

It seems to me undeniable that the mechanisms described in the previous chapter for representing and reconstructing records and files are vastly different from those found in conventional DBMSs, and I presume you agree with this assessment. At the same time, however, they certainly look pretty complicated ... How does all of that complexity square with the claims I made in Chapter 1 regarding good performance? Let me remind you of some of the things I said there:

[TR is] a technology that lets us build database management systems (DBMSs) that are ... orders of magnitude faster than any previous system ... [A] relational system ... using TR technology should dramatically outperform even the fastest of those [previous] systems ... [and] I don't just mean that queries should be faster ... [Updates] should be faster as well.
—from Chapter 1

Well, let me say a little more now regarding query performance specifically (I haven't really discussed updates yet, so I'll have to come back to the question of update performance later—actually in the next chapter). Now, any given query involves two logically distinct processes:

a) Finding the data that's required, and then

b) Retrieving that data.

TR is designed to exploit this fact. Precisely because it separates field value information and linkage information, it can treat these two processes more or less independently. To find the data, it uses the Field Values Table; to retrieve it, it uses the Record Reconstruction Table. (These characterizations aren't 100 percent accurate, but they're good to a first approximation—good enough for present purposes, at any rate.) And the Field Values Table in particular is designed to make the finding of data very efficient (for example, via binary search), as we saw in Chapter 4. Of course, it's true that subsequent retrieval of that data then involves the record reconstruction process, and this latter process in turn involves a lot of pointer chasing, but:


• Even in a disk-based implementation, the system will do its best to ensure that pertinent portions of both the Field Values Table and the Record Reconstruction Table are kept in main memory at run time, as we'll see in Part III. Assuming this goal is met, the reconstruction will be done at main-memory speeds.

• The "frills" to be discussed in Chapters 7‑9 (as well as others that are beyond the scope of this book) have the effect, among other things, of dramatically improving the performance of various aspects of the reconstruction process.

• Most important of all: Almost always, finding the data that's wanted is a much bigger issue than returning that data to the user is. In a sense, the design of the TR internal structures is biased in favor of the first of these issues at the expense of the second. Observe the implication: The more complex the query, the better TR will perform—in comparison with traditional approaches, that is. (Of course, I don't mean to suggest by these remarks that record reconstruction is slow or inefficient—it isn't—nor that TR performs well on complex queries but not on simple ones. I just want to stress the relative importance of finding the data in the first place, that's all.)


I’d like to say more on this question of query performance. In 1969, in his very first paper on the relational model [5], Codd had this to say:

Once aware that a certain relation exists, the user will expect to be able to exploit that relation using any combination of its attributes as “knowns” and the remaining attributes as “unknowns,” because the information (like Everest) is there. This is a system feature (missing from many current information systems) which we shall call (logically) symmetric exploitation of relations. Naturally, symmetry in performance is not to be expected.
—E. F. Codd

Note: I’ve reworded Codd’s remarks just slightly here. In particular, the final sentence (the caveat concerning performance) didn’t appear in the original 1969 paper [5] but was added in the expanded 1970 version [6].

Anyway, the point I want to make is that the TR approach gives us symmetry in performance, too—or, at least, it comes much closer to doing so than previous approaches ever did. This is because, as we saw in Chapter 4, the separation of field values from linkage information effectively allows the data to be physically stored in several different sort orders simultaneously. When Codd said “symmetry in performance is not to be expected,” he was tacitly assuming a direct-image style of implementation, one involving auxiliary structures like those described in Chapter 2. However, as I said in that chapter:

[Auxiliary structures such as pointer chains and] indexes can be used to impose different orderings on a given file and thus (in a sense) “level the playing field” with respect to different processing sequences; all of those sequences are equally good from a logical point of view. But they certainly aren’t equally good from a performance point of view. For example, even if there’s a city index, processing suppliers in city name sequence will involve (in effect) random accesses to storage, precisely because the supplier records aren’t physically stored in city name sequence but are scattered all over the disk.
—from Chapter 2

As we’ve seen, however, these remarks simply don’t apply to the TR data representation.

And now I can address another issue that might possibly have been bothering you. We’ve seen that the TR model relies heavily on pointers. Now, the CODASYL “network model” [14,25] also relies heavily on pointers—as the “object model” [3,4,28,29] and “hierarchic model” [25,56] both do also, as a matter of fact—and I and many other writers have criticized it vigorously in the past on exactly that score (see, for example, references [10], [21], and [37]). So am I arguing out of both sides of my mouth here? How can TR pointers be good while CODASYL pointers are bad?


Well, in fact there are several differences between TR pointers and CODASYL pointers. The biggest of those differences has to do with the question of the target audience: Who is the user of the technology supposed to be in each case?

• The target audience for TR is clearly system programmers, whose job it is to build DBMSs and other data management systems—for example, data mining tools—on top of a TR implementation. In other words, a TR implementation, viewed in isolation, is not and is not meant to be a complete DBMS as such, and the TR model is not and is not meant to be the application programming interface to such a DBMS.

• By contrast, the target audience for CODASYL is application programmers, whose job it is to build application systems—for example, a payroll system—on top of a CODASYL DBMS. In other words, a CODASYL implementation definitely is (or was) meant to be a complete DBMS as such, and the “CODASYL model” is (or was) meant to be the application programming interface to such a DBMS.

And, of course, it’s well established that system programmers do need to be able to make use of pointers, for all kinds of reasons. On the other hand, it’s equally well established that allowing—or, worse, requiring—application programmers to make use of pointers is a very bad idea, again for all kinds of reasons (indeed, this fact is a major justification for the exclusion of pointers from the relational model [40]).

Just as an aside, I simply can’t let the foregoing remarks go by without mentioning the distressing fact that, as I write, most of the mainstream SQL vendors (following the current SQL standard [53]) are busily incorporating pointers—pointers, that is, that are visible to the application programmer—into their “model.” Reference [40] refers to this “feature” of SQL as a Great Blunder, and explains just why it is a blunder; for example, it shows among other things that pointers and a good model of type inheritance are fundamentally incompatible. And reference [30] gives numerous additional reasons as to why the relational model should categorically not be extended or “improved” to include pointers.

Back to the comparison with CODASYL. Another big difference between CODASYL pointers and TR pointers is that CODASYL pointers apply at the record level, while TR pointers apply at the field level. One consequence of this difference is that CODASYL structures are in fact parent/child structures, in the sense of Chapter 2; as a direct consequence, they suffer from all of the problems of such structures identified in that chapter. In particular, therefore, while CODASYL pointers might in principle be used to provide “symmetric exploitation” (although they certainly aren’t used that way in practice), they certainly don’t provide symmetry in performance, because the records can be physically clustered in at most one way (again, see Chapter 2). The same is not true with TR pointers, as we know.

5.3 TR Operators

Now we come to another issue that I’ve been ducking slightly so far. I’ve claimed repeatedly that TR is a model. As such, it must provide some operators to operate on the “objects”—the Field Values Table, etc.—that I’ve been concentrating on so far (as well as providing those “objects” themselves, of course). So what are the TR operators?


Well, at the most fundamental level, of course, TR certainly includes everything necessary to build, search, access, and maintain tables such as the Field Values Table, including in particular all of the obvious subscripting, assignment, and comparison operators. It also includes operators for allocating and deallocating storage and carrying out other such utility functions. All of these operators are only to be expected.

At a slightly higher level, TR also includes a set of operators that are described in some detail in reference [63]. However, most of those “higher-level” operators are still quite low-level in nature; indeed, most of them are intended for use in the implementation of still higher-level operators that will presumably be used by the system programmers mentioned in the previous section. For that reason, I don’t think it’s worth getting into details of those lower-level operators here. However, I do want to say a little about the “system programming interface” ones, even though those operators aren’t really primitive operators of the TR model as such. (Indeed, reference [63] shows how they could actually be implemented in terms of the lower-level operators that it does describe.)

Let’s assume that techniques such as those discussed in Chapter 4—for example, binary searches on columns of the Field Values Table—have already been used to determine that some particular record is of interest. Let me immediately explain what I mean when I say that some record has been “determined to be of interest.” To be specific:

• When I say “some particular record,” I mean a record of the applicable user file.


• When I say such a record is “of interest,” I mean we want the record in question—or the tuple corresponding to that record, rather—to be retrieved, deleted, or updated. (Inserting a new record or tuple is different, of course; we can’t sensibly talk about the new record having been “determined to be of interest,” because the record doesn’t exist yet—at least, not in the database.)

• And when I say techniques have been used “to determine” that the record in question is of interest, I mean it’s been determined that some cell [i,j] of the Record Reconstruction Table corresponds to some portion of that record; more precisely, it’s been determined that cell [i,j] of the Record Reconstruction Table contains a pointer that points to a cell in the Field Values Table that contains some portion of that record. In other words, the record in question is that unique record that corresponds to that particular cell [i,j] of the Record Reconstruction Table. It’s convenient to say, loosely, that the record in question “passes through” that cell [i,j] of the Record Reconstruction Table.

With all of that preamble out of the way, then, the TR operators I want to consider are as follows:

• Retrieve the record passing through cell [i,j] of the Record Reconstruction Table.
• Delete the record passing through cell [i,j] of the Record Reconstruction Table.
• Update the record passing through cell [i,j] of the Record Reconstruction Table.
• Insert a new record.

Of these operators, retrieve has effectively been discussed at length already in Chapter 4—it’s essentially just the business of record reconstruction as described in that chapter (in Section 4.4 in particular). The other three operators are discussed in detail in the next chapter.

One last remark to close the present section: If you happen to be familiar with traditional approaches to implementing the relational model, you might have been expecting to see certain other operators mentioned in the discussion above. For example, the System R prototype [1] consisted of a frontend called the Relational Data System (RDS) and a backend called the Relational Storage System (RSS); the RDS translated user requests—SQL statements, in other words—into RSS operations, and those RSS operations performed such functions as searching indexes, committing and rolling back transactions, and so forth. And those RSS operators included many things that have no direct counterpart in the TR model at all. Some of those operators (for example, those to do with indexes) are omitted from TR because TR simply has no need of them. However, others (for example, COMMIT and ROLLBACK) are omitted because such functionality is meant to be provided above the TR interface. (Indeed, the RSS was really an entire multiuser DBMS in its own right, albeit one whose user interface was rather low-level. By contrast, TR—or a TR implementation, rather—is not a complete DBMS in its own right; rather, it’s meant among other things to serve as the storage manager component for such a DBMS.)
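Before leaving this section, here is a minimal sketch (in Python) of what a system programming interface offering the four operators listed above might look like. Everything about it is hypothetical: the names and signatures are mine, not those of reference [63]; the sketch is meant only to make the shape of the operators, and of the "finding" step that precedes them, concrete.

# A hypothetical sketch only; these names and signatures are not the book's.
from typing import Protocol, Sequence, Any

class TRStore(Protocol):
    def find(self, field: int, value: Any) -> list[tuple[int, int]]:
        """Binary-search the given column of the Field Values Table and return
        the Record Reconstruction Table cells [i, j] that the records of
        interest pass through."""
        ...

    def retrieve(self, cell: tuple[int, int]) -> tuple:
        """Reconstruct and return the record passing through the given cell
        (essentially the zigzag traversal of Chapter 4)."""
        ...

    def delete(self, cell: tuple[int, int]) -> None:
        """Delete the record passing through the given cell."""
        ...

    def update(self, cell: tuple[int, int], new_record: Sequence) -> None:
        """Replace the record passing through the given cell by new_record."""
        ...

    def insert(self, new_record: Sequence) -> None:
        """Insert a brand new record (no cell applies, as noted above)."""
        ...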


5.4 Building the Record Reconstruction Table: An Alternative Approach

In the introduction to Chapter 4, I said I’d occasionally make some mention of alternative implementation schemes for certain aspects of the TR model. In keeping with that promise, I’d now like to take a look at an alternative way of building the Record Reconstruction Table. Note: It might help to repeat the point from Chapter 4 that the Record Reconstruction Table is built directly from the file (the Field Values Table isn’t involved in the process at all).

Now, you might recall that in Chapter 4, Section 4.5, I showed how we could use the Permutation Table instead of the Record Reconstruction Table in order to perform the record reconstruction process. However, I also said it wouldn’t be very efficient to use the Permutation Table in that way, because we’d have to do sequential searches on the columns of that table in order to find the record numbers (that’s why we replaced the Permutation Table by the Record Reconstruction Table in the first place). The trouble is, though, the algorithm for building the Record Reconstruction Table from the Permutation Table still involves doing those same sequential searches—admittedly only when the Record Reconstruction Table is built, not every time it’s used, but those searches still represent overhead, and it would be nice to eliminate that overhead if we can. It turns out we can improve matters by exploiting the inverses of the permutations in the Permutation Table.

Consider once again the original Permutation Table from Chapter 4, Section 4.5 (see Fig. 5.1). As you can see from that table, the S# permutation (for example) is the sequence

4, 3, 5, 1, 2

The meaning, to remind you, is that if the records of the original file (see Fig. 4.1 in Chapter 4) are sorted into ascending S# order, record 4 will appear first, record 3 will appear second, and so on. And the inverse of this permutation is the sequence

4, 5, 2, 1, 3

This inverse permutation is that unique permutation that, if applied to the original sequence 4, 3, 5, 1, 2, will produce the sequence 1, 2, 3, 4, 5. (If SEQ is the original sequence 4, 3, 5, 1, 2, then the fourth entry in SEQ is 1, the fifth is 2, the second is 3, and so on.)

Fig. 5.1: Permutation Table for the suppliers file of Fig. 4.1


More generally, if we think of any given permutation as a vector V, then the inverse permutation V' can be obtained in accordance with the simple rule that if V[i] = i', then V'[i'] = i. Applying this rule to each of the permutations in our given Permutation Table, we obtain the Inverse Permutation Table shown in Fig. 5.2. (Exercise 4: Check that the table is correct.)

Fig. 5.2: Inverse Permutation Table corresponding to Fig. 5.1

We can now use the Inverse Permutation Table to build the Record Reconstruction Table without doing any sequential searches. For example, the first (S#) column of the Record Reconstruction Table can be built as follows: Go to cell [i,1] of the Inverse Permutation Table. Let that cell contain the value r; also, let the next cell to the right, cell [i,2], contain the value r'. Go to the rth row of the Record Reconstruction Table and place the value r' in cell [r,1].


Executing this algorithm for i = 1, 2, ..., 5 yields the entire S# column of the Record Reconstruction Table. The other columns are built analogously. Exercise 5: Check that the foregoing algorithm, when applied to the given Inverse Permutation Table, does indeed produce the Record Reconstruction Table shown in Fig. 4.3 in Chapter 4. (Doing this exercise should convince you that this algorithm is much easier to apply than the one given in Chapter 4; it should also make you understand why the algorithm works, if you haven’t figured it out already. In future chapters, when I need to build a Record Reconstruction Table, I’ll use this new algorithm.)
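Just to make the algorithm concrete, here is a minimal sketch in Python of the whole process: Permutation Table, Inverse Permutation Table, Record Reconstruction Table, and zigzag reconstruction, all for the suppliers file. The code is mine and purely illustrative (it is not taken from reference [63]); I have ordered the supplier records so that the S# permutation comes out as 4, 3, 5, 1, 2, matching Fig. 5.1, but the tie-breaking order within duplicate values is whatever the sort happens to produce and so might differ in detail from the figures.

# A minimal illustrative sketch (not code from reference [63]).  Record and row
# numbers are 1-based, as in the figures.

records = [                       # the suppliers file: fields S#, SNAME, STATUS, CITY
    ("S4", "Clark", 20, "London"),
    ("S5", "Adams", 30, "Athens"),
    ("S2", "Jones", 10, "Paris"),
    ("S1", "Smith", 20, "London"),
    ("S3", "Blake", 30, "Paris"),
]
n, m = len(records), len(records[0])

# Permutation Table: column j lists record numbers in ascending order of field j.
perm = [[r + 1 for r in sorted(range(n), key=lambda r: records[r][j])]
        for j in range(m)]        # e.g. perm[0] (the S# column) is [4, 3, 5, 1, 2]

# Inverse Permutation Table: if perm[j][k-1] = i, then inv[j][i] = k.
inv = [[0] * (n + 1) for _ in range(m)]
for j in range(m):
    for k, i in enumerate(perm[j], start=1):
        inv[j][i] = k

# Record Reconstruction Table, built with no sequential searches: record i passes
# through row inv[j][i] of column j, and that cell must point to row inv[j+1][i]
# of the next column (wrapping around from the last column back to the first).
rrt = [[0] * m for _ in range(n + 1)]          # rrt[row][column], rows 1..n
for j in range(m):
    nxt = (j + 1) % m
    for i in range(1, n + 1):
        rrt[inv[j][i]][j] = inv[nxt][i]

# Field Values Table: each column independently sorted into ascending order.
fvt = [sorted(records[r][j] for r in range(n)) for j in range(m)]

# Zigzag reconstruction, entering at row `row` of the first (S#) column.
def reconstruct(row):
    fields, j = [], 0
    for _ in range(m):
        fields.append(fvt[j][row - 1])
        row, j = rrt[row][j], (j + 1) % m
    return tuple(fields)

print(reconstruct(1))             # ('S1', 'Smith', 20, 'London')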

5.5 Record Reconstruction Revisited

Like the previous section, this one too is concerned with a possible implementation alternative. In that previous section, the Permutation and Inverse Permutation Tables served as purely temporary structures, used in building the Record Reconstruction Table but then discarded. However, it would be possible not to discard them after all, but rather to use them together as a replacement for the Record Reconstruction Table. For example, consider the following SQL query (a projection of a restriction):

SELECT S.S#, S.STATUS
FROM S
WHERE S.CITY = ‘London’ ;

We can implement this query as follows:

• Step 1: Use a binary search to find the London entries in the Field Values Table (see Fig. 4.2 in Chapter 4) and extract the corresponding row numbers. In the example, this step yields the row numbers 2 and 3.

• Step 2: Use those row numbers to look up entries in the CITY column of the Permutation Table (see Fig. 5.1). This step yields the corresponding record numbers, 1 and 4.

• Step 3: Use those record numbers as row numbers to look up entries in the S# column of the Inverse Permutation Table (see Fig. 5.2). This step yields the row numbers 4 and 1, and these values can be used to access the corresponding S# values in the Field Values Table, S1 and S4.

• Step 4: Likewise, use the record numbers from Step 2 to look up entries in the STATUS column of the Inverse Permutation Table. This step yields the row numbers 2 and 3, and these values can be used to access the corresponding status values in the Field Values Table, which are both 20, as it happens.

Execution of the query is now complete.

Comparing the foregoing with what we would have had to have done using the Record Reconstruction Table, we can see that one advantage is that we don’t have to chase pointers through columns that aren’t involved in the query (a fact that could be useful in implementing projection operations, for example). On the other hand, the Permutation and Inverse Permutation Tables together occupy twice as much space as the Record Reconstruction Table does.
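Continuing the little sketch from the previous section (with the same caveats, and purely as an illustration), the four steps just described might look like this:

# Continuing the earlier sketch: SELECT S.S#, S.STATUS FROM S WHERE S.CITY = 'London'
# implemented with the Permutation and Inverse Permutation Tables (perm, inv) and
# the Field Values Table (fvt) built above.  Column numbers are 0-based here.
from bisect import bisect_left, bisect_right

SNO, SNAME, STATUS, CITY = 0, 1, 2, 3

# Step 1: binary search for 'London' in the CITY column of the Field Values Table.
lo = bisect_left(fvt[CITY], "London") + 1       # first matching row (1-based): 2
hi = bisect_right(fvt[CITY], "London")          # last matching row: 3

result = []
for row in range(lo, hi + 1):
    # Step 2: the CITY column of the Permutation Table gives the record number.
    rec = perm[CITY][row - 1]                   # records 1 and 4
    # Steps 3 and 4: the Inverse Permutation Table maps that record number to its
    # row in each wanted column of the Field Values Table.
    result.append((fvt[SNO][inv[SNO][rec] - 1],
                   fvt[STATUS][inv[STATUS][rec] - 1]))

print(result)                                   # [('S4', 20), ('S1', 20)]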


Having said all of the above, let me now say that for definiteness I’ll assume an implementation from this point forward that does do reconstruction via the Record Reconstruction Table, not via the Permutation and Inverse Permutation Tables (barring explicit statements to the contrary). In other words, I’ll assume the Permutation and Inverse Permutation Tables aren’t kept around at run time.

5.6 Pointers are Field Value Surrogates

Consider Fig. 5.3, a repeat of Fig. 4.3 from Chapter 4, which shows the Field Values Table for the suppliers file of Fig. 4.1 together with a corresponding Record Reconstruction Table; more specifically, consider the Field Values Table in that figure. Clearly, the position—that is to say, the row number—of any given field value within its containing column in that table serves as a unique encoding, or surrogate, for the value in question (in other words, the table provides an encoding mechanism for its values). For example, consider the CITY column, which contains, in sequence, the city names Athens, London, London, Paris, and Paris; clearly, the corresponding row numbers 1, 2, 3, 4, 5 can be regarded as surrogates for those values (in sequence as indicated).4 What’s more, those very same row numbers can also be regarded as surrogates for the supplier numbers S1, S2, S3, S4, and S5; the names Adams, Blake, Clark, Jones, and Smith; and the status values 10, 20, 20, 30, and 30 (in sequence as indicated in every case).

Fig. 5.3: Field Values Table of Fig. 4.2 and a corresponding Record Reconstruction Table

It follows from the foregoing that the Record Reconstruction Table can be regarded as containing such field value surrogates (and likewise for the Permutation and Inverse Permutation Tables, of course). For example, the STATUS column in the Record Reconstruction Table of Fig. 5.3 contains, in sequence, the row numbers 4, 2, 3, 1, 5. These row numbers are surrogates for CITY values (not STATUS values); they stand for the values Paris, London, London, Athens, and Paris, respectively, and this sequence is the sequence in which the city names will appear if we ask to see suppliers in status sequence, thus:

SELECT S.S#, S.SNAME, S.STATUS, S.CITY
FROM S
ORDER BY STATUS ;

So now we know that row numbers serve as surrogates for field values, and the Record Reconstruction Table in particular contains such surrogates. This alternative perspective is occasionally useful, as we’ll see in Part III of this book. Now, in the TR model as I’ve described it so far (and indeed as I’ll continue to describe it throughout the remainder of this book), the surrogates in question are always row numbers. But other surrogate schemes are possible and could be useful in different implementation environments—and so such alternative schemes are yet another illustration of the fact that the TR model is capable of many different concrete implementations. Further details are beyond the scope of this book.
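As a tiny illustration of this surrogate perspective (continuing the earlier sketch once again, so all of its caveats apply), the STATUS column of the computed Record Reconstruction Table decodes into exactly that sequence of city names:

# Continuing the earlier sketch: the STATUS column of rrt holds surrogates for
# CITY values; decoding them gives the city names in supplier status order.
status_column = [rrt[row][STATUS] for row in range(1, n + 1)]   # [4, 2, 3, 1, 5]
print([fvt[CITY][r - 1] for r in status_column])
# ['Paris', 'London', 'London', 'Athens', 'Paris']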


5.7 The Field Values Table is a Directory

In this section, I want to consider (briefly) another alternative perspective that can also be helpful on occasion. Consider the Field Values Table in Fig. 5.3 once again; more specifically, consider the CITY column in that table. Let c be a value (city name) in that column, and let the containing cell C be cell [i,4] (meaning city c has surrogate i). Then that subscript [i,4] identifies the unique cell, C' say, in the Record Reconstruction Table that corresponds to cell C in the Field Values Table. That cell C' in turn is part of a zigzag that allows a record containing the CITY value c to be reconstructed. All of the foregoing should really be familiar to you by now, and I mention it here mainly by way of review.

However, let me now point out something that I deliberately haven’t mentioned before: namely, that each column of the Field Values Table effectively serves as a kind of directory—an index, almost!—to the Record Reconstruction Table and thence, eventually, to the corresponding records. For example, consider the city name Athens, which appears in cell [1,4] of the Field Values Table. Following the zigzag through cell [1,4] of the Record Reconstruction Table, we obtain the record for supplier S5 (S# S5, SNAME Adams, STATUS 30, CITY Athens).

(More precisely, we obtain a version of this record in which the left-to-right field ordering is CITY, then S#, then SNAME, then STATUS.)


Please note carefully, however, that though we might indeed say that each column of the Field Values Table is “almost an index,” it certainly isn’t an index in the conventional sense of that term. Perhaps a better way to put it would be to say that the column in question—together with the Record Reconstruction Table, which is certainly needed too—provides the functionality of an index. That is, the column in question and the Record Reconstruction Table together provide indexing functionality (both direct- and sequential-access functionality) on the user file on the basis of values of the corresponding field.
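To see this “indexing functionality” in miniature, here is the CITY column of the earlier sketch being used both for direct access and for sequential access (illustrative code only, with the same caveats as before):

# Continuing the earlier sketch.  reconstruct_from() generalizes reconstruct()
# so that the zigzag can be entered at any column; the result is rearranged back
# into S#, SNAME, STATUS, CITY order.
from bisect import bisect_left

def reconstruct_from(row, col):
    fields, j = {}, col
    for _ in range(m):
        fields[j] = fvt[j][row - 1]
        row, j = rrt[row][j], (j + 1) % m
    return tuple(fields[j] for j in range(m))

# "Direct access": binary-search the CITY column for Athens, then reconstruct.
row = bisect_left(fvt[CITY], "Athens") + 1       # 1-based row: 1
print(reconstruct_from(row, CITY))               # ('S5', 'Adams', 30, 'Athens')

# "Sequential access": walking the CITY column top to bottom reconstructs the
# suppliers in ascending city-name order.
for row in range(1, n + 1):
    print(reconstruct_from(row, CITY))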

5.8 Miscellaneous Implementation Alternatives

I’d like to close this chapter by briefly mentioning a few miscellaneous points regarding alternative implementation possibilities for various other TR constructs.

• I’ve been talking so far as if the linkage information that ties together the field values for a given record must be implemented as a pointer ring or zigzag specifically. But other possibilities exist. For example, we could replace each such ring (within any given Record Reconstruction Table) by two subrings that are connected by means of some common “bridging” column. In the case of suppliers, for example, we might have one subring connecting S# and CITY and another connecting S#, SNAME, and STATUS (column S# being the bridging column, in this particular example). Such an arrangement would be advantageous if projection over S# and CITY is a frequently requested operation—in other words, if queries of the form SELECT S.S#, S.CITY FROM S are common, in SQL terms. What’s more, the pointers in such rings or subrings could be either one-way (as I’ve been assuming so far) or two-way. Other options are also available; one such will turn out to be important in connection with disk-based implementations, and I’ll discuss it in detail in Chapter 14.

• I’ve also been talking so far as if every column in the Field Values Table has to be maintained in sorted order. In practice, however, such is not the case; there might well be some columns for which such sorting is just not worth the overhead. An example might be a text column in which the entries are natural-language comments.

• Furthermore, those columns that are sorted don’t all have to be sorted in the same way. In our examples, I’ve shown all columns sorted in ascending sequence. However, it might be better to keep some columns in descending sequence instead; and in the case of columns defined over a user-defined data type, the sort order might be defined in terms of a user-defined ordering for that type.


File Banding

• First of all, I need to own up to a slight terminological inexactitude on my part. When I first mentioned banding in the introduction to this chapter (also in Chapter 11), I said we decomposed the file into horizontal subfiles called bands, and so “band” was a concept at the file level. As you might have noticed, however, I subsequently started talking about bands fitting into memory at run time, and so “band” became a concept at the TR level (or possibly at some lower and more physical level still). However, it’s very convenient to be able to use the same term band at these different levels of abstraction within the overall system, and—trusting that the practice won’t cause confusion—I intend to continue doing the same thing throughout the remainder of this chapter.

• Next, observe that the Record Reconstruction Table for any given band does indeed involve pointers that are local to the band in question. In the figures I’ve stressed this fact by using 1‑4 as the sole legal pointer values for band one, 5‑8 as the sole legal pointer values for band two, and 9 as the sole legal pointer value for band three. In practice, of course, pointer values need only be unique within the relevant band (since the whole object of the exercise is not to have pointers out of one band into another).

• Precisely because pointers need now be unique only within the relevant band instead of within the entire file, they need fewer bits than they did before (before we did the banding, I mean). To revert for a moment to the example of a file with ten million records: Without banding, pointers are 24 bits, as we know; with banding, however, if one band corresponds to 80,000 records as suggested above, then pointers need be only 17 bits instead of 24—another significant space saving (and, be it noted, one that applies to the large Record Reconstruction Table specifically, a most gratifying state of affairs). Note: These facts explain why the figure of 80,000 rows per band quoted earlier in this section was too low. A more reasonable figure would be 115,000 or so (making the total number of bands 85 or so instead of 125).

• Note that banding does suffer from the drawback that it potentially introduces a degree—a tiny degree—of redundancy into the stored data. Without banding (but with condensing), no field value ever appears more than once in the Field Values Table. With banding, however, the same field value might simultaneously appear in several distinct bands (though never more than once per band). The WEIGHT value 12.0 is a case in point in Fig. 13.5: It appears in both band one and band two. Note: Such redundancy could even apply to the characteristic field, if values of that field aren’t unique (clearly this can’t happen in the example, though, because {P#} is a key—in fact, the only key—for the parts relation). More important, however, note that this particular drawback (the possibility of a tiny amount of redundancy, that is) disappears anyway if the approach to be described in Section 13.4 is adopted.

• Another drawback, perhaps more serious than the previous one, is the following: Since we no longer have just one Record Reconstruction Table for the entire file, it follows a fortiori that we can’t have a “preferred” Record Reconstruction Table for the entire file (as described in Chapter 7) that provides major-to-minor orderings over all of the fields. However, it’s at least true that each of the local Record Reconstruction Tables can be a “preferred” one so far as the records that belong to the band in question are concerned. The Record Reconstruction Tables of Fig. 13.6 are preferred ones in this sense.


• Equality or range queries based on the characteristic field can now be handled very efficiently by going directly to the relevant band or bands. (I’m assuming here that the system will keep certain metadata in memory, saying, for example, that band one has information regarding parts with part numbers in the range P1‑P4, band two has information regarding parts with part numbers in the range P5‑P8, and so on; a small sketch of such metadata appears at the end of this section.) Even if several bands are involved, each can be processed independently of the rest (possibly even in parallel). And, of course, once a given band has been streamed into memory, record reconstruction within that band is a purely in-memory operation. Among other things, therefore, banding can provide functionality analogous, somewhat, to that provided by a conventional clustering index on the characteristic field (see Chapter 2 if you need to refresh your memory regarding clustering indexes).

• Note finally that banding does not have to be done “by hand”; rather, it can be done automatically during the load process (like factoring in the previous chapter), using built-in heuristics and statistical data analyses that are also done at load time. In other words, the benefits of banding, like those of factoring, can be obtained automatically, without any need for human decisions (except as noted in Section 13.5 below).

A Small Digression

You might have noticed something interesting has happened to band three in the example. Band three corresponds to a single record; it therefore contains a Field Values Table of just a single row and a Record Reconstruction Table of just a single row. Observe now that:

• In the case of the Field Values Table—ignoring the row ranges, which are clearly pretty pointless here anyway—the single row is effectively a direct-image representation of the record in question: namely, the “large-file” record for part P9 (see Fig. 13.1).

• In the case of the Record Reconstruction Table, the single row contains a “zigzag” that’s in fact a straight line—necessarily so, of course. But a zigzag that’s a straight line isn’t all that useful, because the pertinent record can easily be “reconstructed” without it. To be specific, the field values of that record are now linked by physical contiguity in the Field Values Table (or something that might be thought of as akin to physical contiguity, at any rate).

It should be clear from the foregoing discussion that if the band size were such that every band corresponded to a single record, then we would be getting rather close to a direct-image representation of the entire file. It’s interesting to observe, therefore, that—in a sense—the conventional direct-image style of representation might be regarded as just a highly suboptimal special case of the much more general TR style of representation. I’ll leave this observation as something for you to meditate on at your leisure.
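Before moving on, here is the small sketch of band metadata promised in the first bullet above. It is purely illustrative (the structure and names are mine, though the P1‑P4 / P5‑P8 / P9 ranges are the ones quoted in the text), and it shows how an equality or range predicate on the characteristic field can be routed to just the relevant band or bands:

# A hypothetical sketch of in-memory band metadata for the banding by part number.
# Plain string comparison suffices for these single-digit part numbers.
pno_bands = [
    (1, "P1", "P4"),           # (band number, lowest P#, highest P#)
    (2, "P5", "P8"),
    (3, "P9", "P9"),
]

def bands_for(bands, low, high):
    """Bands whose value range overlaps the predicate range [low, high]."""
    return [b for (b, lo, hi) in bands if not (hi < low or lo > high)]

print(bands_for(pno_bands, "P6", "P6"))    # equality on P6: [2] -- only band two is streamed in
print(bands_for(pno_bands, "P3", "P7"))    # a range query: [1, 2]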


13.3 Elaborating on the Example

Let’s get back to the main thread of our discussion. We’ve seen that, with banding, equality and range queries based on the characteristic field—the P# field, in the example—can be handled very efficiently, because the implementation can go directly to the relevant band or bands, stream it or them into memory, and complete processing of the query as a pure in-memory operation. But what about queries based on some other field? For example, consider the following SQL query:

SELECT DISTINCT P.P#
FROM P
WHERE P.WEIGHT = 12.0 ;

As I pointed out in the previous section, the WEIGHT value 12.0 appears in both band one and band two. In the worst case, of course, the same WEIGHT value could appear in every band, precisely because the original file was sorted on P#, not WEIGHT. Now, it might at least be possible for the implementation to know, from the Field Values Table(s), just which bands a given value does in fact appear in; but if it doesn’t (and possibly even if it does), a query like the one just shown will effectively require a scan of the entire file, and performance might thus be poor. As noted in Chapter 11, in other words, the symmetric performance property is lost.

One way to address this problem is to band the original file twice, once using P# as the characteristic field and once using WEIGHT. In this way, we can have one set of banded Record Reconstruction Tables corresponding to the P# sort order and another set corresponding to the WEIGHT sort order: a form of controlled redundancy (see Section 13.5 for further discussion). Figs. 13.7, 13.8, and 13.9 show, respectively, the banded version of the file, the Field Values Tables for those bands, and Record Reconstruction Tables for those bands, if we sort and band by part weight as suggested (more precisely, by part number within part weight). Note: I’m still assuming four records per band, of course.

Fig. 13.7: Banding parts by weight


Fig. 13.8: Field Values Tables for the bands of Fig. 13.7

Fig. 13.9: Record Reconstruction Tables for the bands of Fig. 13.7

The query

SELECT DISTINCT P.P#
FROM P
WHERE P.WEIGHT = 12.0 ;

can now be implemented by going directly to band one (only) in the foregoing banding, and symmetry of performance is restored.


13.4 How It’s Really Done

Now I need to clean up my act ... I said in Section 13.1 that we build a separate Field Values Table and Record Reconstruction Table for each band in the banded file. In fact, however, that statement isn’t quite accurate. What we really do is this: First, we build a single Field Values Table for the entire file in the usual way; then, for each band, we build a band-local “Field Values Table” (or an analog of such a table, rather) that contains, not field values as such, but rather pointers into the overall Field Values Table for the whole file. Let’s see how this works out in our example. First, Fig. 13.10 (a copy of Fig. 13.2) shows the Field Values Table for the entire “large file” from Fig. 13.1:

Fig. 13.10: Field Values Table for the large file of Fig. 13.1 (same as Fig. 13.2)


Let’s assume once again that we want to sort and band on part number. Here then (repeated from Fig. 13.4) is the first band:

Here’s the Field Values Table for this band as given in Fig. 13.5:

And here’s the corresponding analog of this Field Values Table with pointers into the main Field Values Table of Fig. 13.10 instead of actual field values (for convenience, I’ve extracted the corresponding Record Reconstruction Table from Fig. 13.6 and shown it on the right):


Here for completeness are the Field Values Table analogs and Record Reconstruction Tables for the other two bands:

By way of example, let’s consider the problem of reconstructing the record for part P6, say. The sequence of events is as follows:

• First we perform an in-memory look-up for part P6 in the Field Values Table of Fig. 13.10, and we discover that the record we want passes through cell [6,1] of that table.

• Knowing that part P6 falls into band two, we adjust that [6,1] to [2,1] to account for the fact that band one contains four parts. Note: Actually this step is unnecessary in our example, because I’ve numbered the rows 1‑4 within band one, 5‑8 within band two, and 9 within band three, and so we already know—albeit unrealistically—that [6,1] refers to a cell within band two. For definiteness and clarity, I’ll continue to rely on that unrealistic assumption that row numbers are globally unique as shown in the figures.

• We stream band two into memory if it’s not already there.

• Next, we follow the zigzag passing through cell [6,1] of the Record Reconstruction Table in band two (an in-memory process). That zigzag looks like this:

[6,1], [6,2], [7,3], [5,4]

We use in-memory look-ups to determine that the pointers (row numbers) in the corresponding cells of the corresponding Field Values Table analog are:

6, 3, 5, 1


• We therefore go to cells [6,1], [3,2], [5,3], [1,4] of the Field Values Table of Fig. 13.10 (another in-memory process). The corresponding values are:

P6, Cog, 19.0, cc1

Reconstruction of the desired record is now complete.

At this point I’d like to remind you of something. In Chapter 5 (Section 5.6), I pointed out that row numbers can be regarded as surrogates for field values. In the record reconstruction example just now, for instance, the sequence of row numbers 6, 3, 5, 1 can be regarded as surrogates for the sequence of field values P6, Cog, 19.0, cc1. Thus, we might reasonably think of the band-local Field Values Table analogs as containing, not actual field values as such (as indeed we now know), but surrogates for such field values instead.
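Expressed as code, the reconstruction process just walked through might look something like the following sketch (the names and data structures are mine and hypothetical; rows and columns are 1-based, as in the figures):

# A minimal sketch of band-local record reconstruction.  band_fvt holds
# surrogates (row numbers into the single global Field Values Table), while
# band_rrt holds the usual zigzag pointers, both local to the band that has
# been streamed into memory; global_fvt holds the actual field values.
def reconstruct_in_band(row, m, band_rrt, band_fvt, global_fvt):
    """Reconstruct the record whose zigzag passes through cell [row, 1] of the band.

    band_rrt[(r, j)]   : band-local zigzag pointer (a row within this band)
    band_fvt[(r, j)]   : surrogate, i.e. a row number in column j of global_fvt
    global_fvt[(r, j)] : the actual field value in that cell
    """
    fields = []
    j = 1
    for _ in range(m):
        surrogate = band_fvt[(row, j)]             # in-memory look-up within the band
        fields.append(global_fvt[(surrogate, j)])  # in-memory look-up in the big table
        row = band_rrt[(row, j)]                   # follow the zigzag within the band
        j = j % m + 1                              # next column, wrapping back to column 1
    return tuple(fields)

For the part P6 example above, the successive surrogates encountered would be 6, 3, 5, and 1, and the values fetched from the global Field Values Table would be P6, Cog, 19.0, and cc1, exactly as in the walk-through.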



13.5 Controlled Redundancy

In Section 13.3, I said that banding the parts file twice, once on part number and once on part weight, amounted to a form of controlled redundancy. Now, as I’m sure you know, redundancy in what’s stored is usually considered to be a bad thing—not least because it can lead to inconsistencies. However, it’s only when the redundancy is uncontrolled that it’s unquestionably bad. Controlled redundancy—in other words, redundancy that’s deliberately introduced and properly managed—is (or can be) fine; indeed, there are many sound reasons, both business and technical reasons, for storing several copies of the same data. But it does need to be understood that “controlled” here means that:

a) The DBMS must be aware of the redundancy if it exists, and more particularly that

b) The DBMS must take responsibility for “propagating updates” and maintaining data consistency (in other words, the redundancy must effectively be hidden from the user).

I’ll come back to this point in a few moments.

Let’s return for a moment to the example from Section 13.3. Banding the parts file twice as suggested in that section clearly means we’re going to need twice as much storage space. But I remind you that the data is already highly compressed; typically, as I pointed out in the introduction to this chapter, the TR representation requires only some 20 percent of the space required for a direct-image representation of the same data. So we can afford to band and store the original file five different ways and still not require any more storage than a conventional system does—and that’s before the storage for indexes and other auxiliary structures is taken into account, in the direct-image case. Note: The point is worth making that indexes and the like effectively constitute a form of controlled redundancy in conventional systems anyway. And I’ve already mentioned the amount of storage space that kind of redundancy can involve (as noted in Section 13.1, a further fivefold increase is not at all atypical).

The kind of redundancy we’re talking about in TR, then, is (to repeat) only redundancy on top of something that’s already highly compressed. What’s more, it’s only redundancy on top of that portion of the data that can’t be handled by the factoring techniques of Chapter 12. And what’s more again, it’s the right kind of redundancy. It’s not the field values that are stored redundantly; rather, it’s the linkage information. (No field value is ever stored more than once on the disk—assuming, of course, that column condensing and merging is done, as it certainly will be in a disk implementation. Contrast the situation in a direct-image system, with its indexes and other auxiliary structures, where it’s virtually guaranteed that the very same field values will be stored many, many times over.) Storing the linkage information in different ways in a disk-based system is precisely what lets us achieve symmetry of performance in such a system.

Note, moreover, that it’s a comparatively straightforward matter to decide what redundancies to store (in other words, to decide what sortings and bandings should be done). Detailed knowledge of the internal workings of the system is not required; nor is detailed knowledge of exactly the kinds of queries that users will submit. All that’s needed is a general sense as to which fields are the pragmatically important ones—and this knowledge could even be obtained by the system itself, by analyzing actual or typical query sequences.
Of course, if the system doesn’t determine for itself what sortings and bandings are desirable, then the database administrator will have to tell it; in other words, human decisions will be required. But (to repeat) I don’t think the decisions in question are very difficult ones.


Now, the obvious drawback to storing data redundantly is the impact it’s likely to have on updates: If N distinct copies are stored of some given data item X, then an update to any one of those copies must be propagated to all the rest. But this is a much more tractable problem in TR than it usually is for at least two reasons:

• First, I’ve already said that field values as such aren’t stored redundantly (so that “data item X” in my example just now can’t be a field value in TR). Note in particular that update propagation can’t affect index entries in TR, because there aren’t any index entries in TR. Thus, update propagation is primarily a question of maintaining the pointers that are used in record reconstruction.

• Second, updates in the real world typically affect only a tiny portion of the overall database, as explained in Chapter 6. Typically, TR exploits this fact by keeping most of the database static for most of the time and segregating all updates in a much smaller overflow structure of their own (see Chapter 6, Section 6.5). That overflow structure is thus the only portion of the database that needs to be maintained in real time, and hence the only place where anything like update propagation has to be done in real time.

One last point in connection with redundancy in TR: Even if the database as stored does involve redundancy in the form of different bandings, there’s no need to copy all of those different bandings to backup storage every time a full database backup is to be taken. All that’s necessary is to copy just one of the bandings (the others can be recreated from that one).

Endnotes

1. To a first approximation yet again. In fact, as we saw in Chapter 11 (Section 11.5), there are compression techniques that do still work, even on that large table. For simplicity, however, I won’t attempt to incorporate any of those techniques into my examples in this chapter.

2. This is not an overstatement. For example, in a report on the performance of a certain well-known SQL product on the standard TPC‑H benchmark, reference [67] shows a raw data set of three terabytes expanding out to occupy nearly 60 terabytes of disk space, a twentyfold increase.

3. Contrast factoring, where we decompose files vertically (see the previous chapter). Indeed, just as there are certain parallels between factoring and conventional projection/join normalization, so there are certain parallels between banding and what might be called restriction/union normalization. Restriction/union normalization is a logical design technique, not much researched at the time of writing and certainly not yet much used in practice, in which the decomposition operator is restriction and the recomposition operator is union [32].

226 Download free eBooks at bookboon.com

Go Faster!

Stars and Zigzags

14 Stars and Zigzags

14.1 Introduction

This is the last chapter in this part of the book. In it, I want to describe a rather different approach to the problem of implementing the TR model on disk: more specifically, to the problem of minimizing disk seeks. Note immediately, therefore, that the approach in question can be regarded in part as an alternative to file banding as discussed in Chapter 13—but only in part, because in fact file banding can be used in combination with the approach to be described, as we’ll see in Section 14.4. Note too that, as with the discussion of file banding in Chapter 13, we’re primarily concerned here with how to deal with the “large file” that remains after file factoring has been used to get all of the “small files” into memory.

But first things first. As we know, the basic problem with TR on the disk is that if we’re not careful, the zigzags can splay out all over the disk. Well, if the splay problem is caused by the zigzags, then let’s get rid of the zigzags! Recall from Chapter 5 (Section 5.8) that the linkage information that lets us reconstruct records doesn’t have to be implemented as zigzags specifically—other possibilities exist, with (of course) different performance characteristics. The approach to be described in this chapter exploits this idea; essentially, what it does is replace the zigzags by a different kind of structure called a star.


Let me illustrate this idea right away. Fig. 14.1 shows the Field Values Table and corresponding Record Reconstruction Table from Figs. 13.2 and 13.3 in Chapter 13—except that, for pedagogic reasons, I’ve shown the Field Values Table in uncondensed form. Fig. 14.2 then highlights one particular zigzag from Fig. 14.1 (actually the one for part P7), and Fig. 14.3 shows what happens if we replace that zigzag by a star.

Fig. 14.1: Uncondensed Field Values Table and corresponding Record Reconstruction Table

Fig. 14.2: Zigzag for part P7


Fig. 14.3: Star for part P7 (with P# the core)

As you can see, where Fig. 14.2 has a ring of pointers (implemented within the Record Reconstruction Table and conceptually superimposed on the Field Values Table), Fig. 14.3 has a star of pointers instead. Cell [7,1], which corresponds to the P# value P7, serves as the center or core of that star. Three pointers emanate from that core and point to cells [6,2], [8,3], and [4,4], respectively; those cells correspond to the PNAME value Nut, the WEIGHT value 19.0, and the CC# value cc1, respectively. Those three pointers, which (as Fig. 14.3 indicates) are all two-way and can therefore be traversed in either direction, serve as the spokes or rays of the star.

Now, the star in the figure clearly does support reconstruction of the record for the part in question (part P7). To be specific:

a) If we start at the core, we can simply follow the three spoke pointers outward to obtain the other three field values.

b) If we start at any other point, we can follow the corresponding spoke pointer inward to the core and then proceed as under a) above—with the exception that, if we get to the core by following spoke pointer sp inward, then of course there’s no need to follow that particular spoke sp outward again.

Note: As a matter of fact, we never need to follow a spoke outward from the core within the Record Reconstruction Table as such; we only need to be able to go from the core outward to cells within the Field Values Table.

Now, you might have already realized that, for any given zigzag, there are several distinct but equivalent stars—it just depends on which field we choose as the core. I’ll return to this point in Section 14.3. You might also have realized that the record reconstruction algorithm as just outlined displays asymmetric performance—access via the core field will be faster than access via any other field, because stars (unlike zigzags) are an inherently asymmetric structure—and I’ll return to this point in Section 14.5.

The structure of the chapter is as follows. Following this introductory section, Section 14.2 gives a simple example to illustrate the basic ideas behind star structures. Section 14.3 elaborates on and generalizes that example. Section 14.4 shows how the ideas from the first three sections work on the disk (those previous sections are principally concerned with a memory-based implementation only). Finally, Section 14.5 discusses the use of controlled redundancy in connection with star structures.
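Cases a) and b) can also be expressed as a short sketch in code (the code and names are mine, purely for illustration; nothing here comes from reference [63]):

# A minimal sketch of record reconstruction from a star table.  The core column
# holds, for each core cell, a tuple of pointers to the other columns' Field
# Values Table rows; every non-core cell holds a single pointer back to its core
# row.  Rows and columns are 1-based, as in the figures.
def reconstruct_from_star(row, col, core, m, star, fvt):
    """Reconstruct the record passing through cell [row, col] of the star table."""
    core_row = row if col == core else star[(row, col)]   # case b): one hop inward
    spokes = star[(core_row, core)]                        # case a): the outward pointers
    record = {core: fvt[(core_row, core)]}
    other_cols = [j for j in range(1, m + 1) if j != core]
    for j, spoke_row in zip(other_cols, spokes):
        record[j] = fvt[(spoke_row, j)]                    # go straight to the Field Values Table
    return tuple(record[j] for j in range(1, m + 1))

For the star of Fig. 14.3, for example, entering at the WEIGHT cell [8,3] would follow the spoke inward to the core cell [7,1] and then go straight to cells [7,1], [6,2], [8,3], and [4,4] of the Field Values Table, giving the record for part P7: P7, Nut, 19.0, cc1.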


14.2 A Simple Example

As in the previous chapter, the basic problem we’re trying to deal with is how to get the best possible performance out of the “large” Record Reconstruction Table in a disk-based system. So I’ll base my discussions on the same running example as in that previous chapter; to be specific, I’ll assume once again that we’ve factored the parts file into large and small files that look like this:

Large file: P# PNAME WEIGHT CC#
Small file: CC# COLOR CITY

However, we’re interested here in the large file exclusively. Fig. 14.4 shows a sample value for that file (extracted from Fig. 13.1 in Chapter 13). And we’ve already seen a Field Values Table and a zigzag-based Record Reconstruction Table for that file in Fig. 14.1 above. Note: While the file shown in Fig. 14.4 is obviously not very large, let me remind you that we’re really supposed to be dealing with files of millions or even billions of records, and the data in those files isn’t supposed to display any “statistical clumpiness” at all.

Fig. 14.4: Sample file

Now, despite the fact that we’re really supposed to be talking about a disk implementation, it’s convenient to pretend for the time being that everything’s in memory, and I’ll adopt that pretense until further notice. So how do we proceed? Well, since (as we’ve already seen) stars are asymmetric, the first thing we have to do is decide what the core’s going to be; in other words, we first have to choose a core field (much as we had to choose a characteristic field in connection with banding in the previous chapter). Suppose we choose field P#. Then Fig. 14.5 shows a corresponding star-based Record Reconstruction Table for the file of Fig. 14.4. Note: From this point forward, for convenience, I’ll abbreviate the term “star-based Record Reconstruction Table” to just star table, and similarly for zigzag table.


Fig. 14.5: Star-based Record Reconstruction Table for the file of Fig. 14.4 (with P# the core)

In order to explain the star table of Fig. 14.5, let’s go back for a moment to the zigzag table of Fig. 14.1. Consider the zigzag for (say) part P7, which—as Fig. 14.2 shows graphically—looks like this:

[7,1], [6,2], [8,3], [4,4]


In a star analog of this zigzag, therefore, the core cell [7,1] will contain the pointer triple 6‑8‑4 (these are the outward portions of the spokes) and cells [6,2], [8,3], and [4,4] will each contain a 7 (these are the inward portions of the spokes). And similarly, of course, for all of the other stars in the table. Note: I’ve expanded the heading of column P# in Fig. 14.5 to show which pointers are which. To be specific, in the triple n‑w‑c, n is the PNAME (name) pointer, w is the WEIGHT pointer, and c is the CC# pointer. Observe, incidentally, that it’s a consequence of the way the star table in the example is defined that:

a) The first “subcolumn” within column P# of the star table—the one for PNAME, labeled n in the figure—is identical to column P# of the zigzag table (why, exactly?);

b) Column CC# of the star table (the last column) is identical to column CC# of the zigzag table (again, why exactly?).

Now let’s consider how the star table of Fig. 14.5 might be used to implement queries. Consider the following simple example:

SELECT DISTINCT P.P#, P.WEIGHT
FROM P
ORDER BY P# ;

This query refers to relation P, but of course it can be implemented by accessing the large file only—we don’t need to touch the small file at all. (This is a generic observation, and I won’t bother to repeat it in subsequent examples.) So what we have to do is this:

• Traverse column P# of the star table top to bottom to obtain the result in the desired ordering.

• The first cell encountered, cell [1,1], corresponds to cell [1,1] in the Field Values Table, which (as Fig. 14.1 shows) contains the part number P1.

• That same first cell in column P# of the star table contains the pointer triple 5‑1‑1. The first pointer in this triple corresponds to a part name, the second to a weight, and the third to a CC# value. However, the query isn’t interested in part names or CC# values, so we can go just to the WEIGHT cell in the Field Values Table—which is to say cell [1,3]—to obtain the desired weight value. The first result record has now been constructed.

• All other result records are constructed analogously.

Note: If the query had specified ORDER BY WEIGHT instead of ORDER BY P#, we would have accessed the star table by column WEIGHT instead of column P#. For each cell encountered, we would have followed the inward pointer to the corresponding core cell and then used that core cell to construct the corresponding result record as above.

One point that emerges right away from the foregoing is that stars might be better than zigzags for implementing projections. To be specific, if the file has M fields, then (in general) zigzags require M accesses to the Record Reconstruction Table for each result record no matter how many fields are requested, while stars require at most two (and often only one). This fact makes stars attractive, because it’s quite rare in practice for a query to request all of the fields of the file (or all of the attributes of the relation, rather).
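Purely as an illustration (hypothetical names again, using the same dictionary representation of the star table as in the sketch in Section 14.1), the query just discussed might be implemented like this:

# With P# the core (column 1), the projection query
#   SELECT DISTINCT P.P#, P.WEIGHT FROM P ORDER BY P# ;
# needs only one star-table access per result record: scan the core column top
# to bottom and follow just the WEIGHT spoke, ignoring the PNAME and CC# spokes.
def pno_and_weight(n, star, fvt, core=1, weight_col=3):
    result = []
    for row in range(1, n + 1):                  # top-to-bottom scan gives ascending P# order
        spokes = star[(row, core)]               # the pointer triple n-w-c for this core cell
        weight_row = spokes[1]                   # w, the WEIGHT pointer; skip n and c
        result.append((fvt[(row, core)], fvt[(weight_row, weight_col)]))
    return result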


Here’s another sample query:

SELECT DISTINCT P.P#
FROM P
WHERE P.WEIGHT = 19.0 ;

Here we do a binary search on column WEIGHT of the Field Values Table and determine that the records we want pass through cells [7,3] and [8,3] of the star table. Then we use the stars corresponding to those cells to construct the desired records.

Before closing this section, I should draw your attention to one more point: namely, that a star table will always be bigger than its zigzag analog. Again suppose the file has M fields. Then the zigzag table will have M pointers per record, but the star table will have 2(M-1)—so if M is large, the star table will be almost twice the size of the zigzag table. Note: I’m assuming here that M is greater than one. What happens if that assumption is invalid?

14.3 Elaborating on the Example

Now let’s see what happens if we choose a field other than one corresponding to some key as the core field. Let’s choose field WEIGHT. Fig. 14.6 shows what happens to the star for part P7 under this assumption; Fig. 14.7 shows the corresponding star table in its entirety. Observe that:

a) The first “subcolumn” within column WEIGHT of the star table of Fig. 14.7—the subcolumn for CC#, labeled c in the figure—is identical to column WEIGHT of the zigzag table of Fig. 14.1;

b) Column PNAME of the star table of Fig. 14.7 is identical to column PNAME of the zigzag table of Fig. 14.1.

Fig. 14.6: Star for part P7 (with WEIGHT the core)


Fig. 14.7: Star table for the file of Fig. 14.4 (with WEIGHT the core)

Observe too that (to spell out the obvious) the star tables of Figs. 14.5 and 14.7 are different. Thus, while we might reasonably talk about “the” zigzag table for a given file, we can’t sensibly talk about “the” star table for that same file; instead, we have to talk about the star table that corresponds to the given file together with some given core field. Now I want to make another point. Suppose we use the star table of Fig. 14.7 to reconstruct the entire file; suppose for definiteness that we perform this process using column PNAME (that is, we traverse column PNAME of the star table top to bottom). Here’s the reconstruction process spelled out in detail:


• From cell [1,2] of the star table, follow the spoke inward to the corresponding core cell [5,3]. • Go to the Field Values Table and extract the WEIGHT value in cell [5,3] (from Fig. 14.1, we see the value in question is 17.0). • Cell [5,3] of the star table contains the pointers 5, 2, and 1. Go to the Field Values Table and extract the CC# value in cell [5,4], the P# value in cell [2,1], and the PNAME value in cell [1,2]. Those values are cc2, P2, and Bolt, respectively, and we have now constructed the first result record. Note: Alternatively, of course, we could have obtained the PNAME value (Bolt) in the first step by going straight to cell [1,2] of the Field Values Table (since we started out with cell [1,2] of the star table in the first place). Performing this sequence of steps eight more times but starting successive iterations with cells [2,2], [3,2], ..., [9,2] of the star table (for the second, third, ..., ninth record in the overall file reconstruction process), we obtain the result shown in Fig. 14.8. That result, as you can see by inspection (or by comparison with Fig. 13.1 in Chapter 13), consists of the original nine part records ordered by weight within name (more precisely, ordered by P# within CC# within WEIGHT within PNAME). In other words, the star table of Fig. 14.7 is a “preferred” one in the sense of Chapter 7. (So too is the star table of Fig. 14.5, as a matter of fact.)

Fig. 14.8: Part records ordered by WEIGHT within PNAME
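To show the shape of the overall process, here is a small illustrative Python sketch (invented data, not the parts file of Fig. 14.4, and 0-based indexes) of reconstructing every record of a three-field file by entering a star table through a non-core column, following each spoke inward to the core cell, and then following the core cell’s outward spokes:

# Hypothetical three-field file (P#, PNAME, WEIGHT), WEIGHT chosen as the core field.
# Field Values Table: each column independently sorted (0-based cell indexes).
fvt = {
    "P#":     ["P1", "P2", "P3"],
    "PNAME":  ["Bolt", "Bolt", "Nut"],
    "WEIGHT": [12.0, 14.0, 17.0],
}

# Star table, written out by hand for this toy example.
# Core column: each cell holds the outward spokes (P# pointer, PNAME pointer).
core_spokes = [(1, 0), (0, 2), (2, 1)]
# Non-core columns: each cell holds the inward spoke (pointer to a core-column cell).
inward = {
    "P#":    [1, 0, 2],
    "PNAME": [0, 2, 1],
}

def reconstruct_via(column):
    # Rebuild every record, ordered by `column`, by following spokes inward, then outward.
    out = []
    for cell, core_cell in enumerate(inward[column]):
        p_ptr, n_ptr = core_spokes[core_cell]          # outward portions of the star
        out.append((fvt["P#"][p_ptr],
                    fvt["PNAME"][n_ptr],
                    fvt["WEIGHT"][core_cell]))         # core cell position = WEIGHT cell
    return out

print(reconstruct_via("PNAME"))
# [('P2', 'Bolt', 12.0), ('P3', 'Bolt', 17.0), ('P1', 'Nut', 14.0)]

Entering through PNAME, as here, delivers the file in PNAME order, just as the text describes for Fig. 14.8.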

One last point to close this section: Consider, by way of example, cell [8,3] of the star table in Fig. 14.7. That cell corresponds directly to cell [8,3] of the Field Values Table in Fig. 14.1, which contains the weight 19.0. That same cell [8,3] in the star table contains the pointers 4, 7, and 6, which take us to cells [4,4], [7,1], and [6,2] of the Field Values Table, and those cells in turn contain the CC# value cc1, the part number P7, and the name Nut, respectively. In a sense, therefore, that star table cell [8,3] can be thought of, all by itself, as a digitized version of the entire record for part P73—its position within the star table effectively specifies one component of that record (the WEIGHT component), and its contents effectively specify the other three components (the CC#, P#, and PNAME components). Note: It’s relevant to mention once again that—as we first saw in Chapter 5, Section 5.6—pointers to Field Values Table cells can usefully be thought of as surrogates for the field values contained within those cells.


14.4 What Happens on Disk

Now let’s drop the pretense that everything’s in memory and see how the ideas we’ve been discussing work out in a disk environment—by which I mean, primarily, an environment in which the star table is too big to be memory-resident. Note: It’s easier to talk in terms of just one star table at a time; in what follows, therefore, I’ll pretend there is indeed just one such table (barring explicit statements to the contrary), and I’ll refer to it as “the” star table. The first point is that, in the case of the core column in particular, we’re probably going to want to extend the columnwise storage idea to store each subcolumn of that column as an independent column in its own right. The reason is that (as we saw in Section 14.2) we usually don’t need to access all of those subcolumns in implementing any given query, and we certainly don’t want to read anything into memory that we don’t need, if we can help it. Fig. 14.9 shows the star table of Fig. 14.5—the one based on P# as the core field—with the core column divided up into separate columns in this way. Note: The term subcolumn, which I’ve now used several times, is (I hope) self-explanatory. From this point forward, I’ll use the term core column to mean the combination of all pertinent subcolumns.

Fig. 14.9: Star table of Fig. 14.5 with core column subdivided

The table of Fig. 14.9 will be stored on disk as six separate columns, and the implementation will thus be able to go directly to the start of any of those stored columns at any time. What’s more, those six columns will also be stored in consecutive pages on the disk (the first column in one set of pages, the second in the immediately following set, and so on), with a view to minimizing seek activity and allowing successive columns to be streamed into memory at run time. (Of course, it’s highly desirable for the pertinent core subcolumns to be in memory at run time—where by “pertinent” I mean the ones that are needed for any given query—even if the star table overall is too big to fit into memory in its entirety.)


So what’s good about this arrangement from a performance point of view? Well, it certainly means that pointers in any given column of the star table will be physically contiguous on the disk, with the implication that traversing such a column will be fast. And since these remarks apply to subcolumns of the core column in particular, it follows that doing record or file reconstruction via the core column will not lead to the splay problem. What’s more, so long as the core column is in memory, doing reconstruction via any other column will be efficient too; but if the core column is too big to fit into memory, then reconstruction via any other column still has the potential to be extremely inefficient. However, at least we don’t have to deal with “splayed zigzags” as such. So what can we do if the core column is too big to fit into memory? One approach would be to use banding, more or less as described in the previous chapter. Banding, as you’ll recall, is a divide-and-conquer technique that works by dividing the file up into horizontal subfiles or bands such that (a) each band fits into memory and (b) no pointing occurs between bands. Refer to Chapter 13 for further discussion of this possibility. The section immediately following describes a different approach to the same problem.

14.5 Controlled Redundancy

We’ve seen that, in order to define a star table, the first thing we have to do is choose a core field.4 Now, when we were discussing banding in Chapter 13, the choice of characteristic field had major performance implications, because that choice effectively dictated the stored sort order for the file. And the situation is similar (though not identical) with a star table and its core field; again our choice can have major performance implications. To be more specific, if we choose C as the core field, then queries that involve access to the file in sequence by values of C will be very efficient, but (as explained near the end of the previous section) queries that involve access in any other sequence might not be. The obvious solution to this problem is to introduce some form of controlled redundancy once again. I discussed controlled redundancy at some length in Chapter 13 (Section 13.5); most of the points I made there apply here too, and I won’t bother to repeat them all now. Let me just remind you yet again that the TR representation of a given data set typically requires only some 20 percent of the space required for a direct-image representation; as a result, we can afford to store the data up to five different ways without taking up any more space than a conventional system would need (and that’s just for storing the raw data alone, in that conventional system). So let’s consider the question of redundancy in the context of star tables specifically. Clearly, the simplest thing to do is to store several different star tables for the same file, each one based on a different core field. For example, we could store both the star table of Fig. 14.5 (based on P#) and the star table of Fig. 14.7 (based on WEIGHT), and thereby achieve symmetric performance for access to parts based on P# and access to parts based on WEIGHT. An alternative approach would be to store just one star table but to expand that table to include, in addition to what I’ll now call the primary core column, a set of secondary core columns as well. Each such secondary core column will contain essentially the same information as the primary one, but sorted into a sequence that matches the sort sequence of some field that’s not the (primary) core field.


To see how this idea works out in practice, let’s work through an example. Consider column WEIGHT in the star table of Fig. 14.9. As you can see, that column contains the following pointers (row numbers) in top-to-bottom sequence:

1, 5, 4, 8, 2, 3, 6, 7, 9

This sequence is in fact the permutation that corresponds to the specification ORDER BY WEIGHT, CC#, P#, PNAME; it means, for example, that record 8 of the file—see Fig. 14.4—appears in position 4 in that ordering. It follows that if we were to process the core column of that same star table by taking the first cell first, the fifth cell second, the fourth cell third, and so on, we would reconstruct a version of the file that was in exactly that ordering.

The trouble is, of course, that processing the core column in the way just indicated could lead to a lot of seek activity on the disk. So why not store another copy of that core column that’s rearranged into exactly the sequence we want? The result might look like this:

5-1-1
2-2-8
7-3-2
9-4-9
1-5-5
8-6-6
3-7-3
6-8-4
4-9-7

If such a column is added—redundantly, of course—to the star table of Fig. 14.5, we’ll be able to reconstruct the desired file from that table in the desired order without all of that seek activity on the disk, precisely because that new column will be stored as a separate column in its own right on the disk. (Well, actually it’ll be stored as three separate columns, one for each subcolumn, but that detail need not concern us here.)

But wait a moment ... Recall that within a given triple of pointers n-w-c in the primary core column, n is the name pointer, w is the weight pointer, and c is the CC# pointer. Clearly there’s no need to include the w pointers in the secondary core column, because the value of the w pointer in the ith cell will always simply be i (as you might have already noticed). Thus, the final version of the star table with both a primary core column and a secondary core column will look as shown in Fig. 14.10 (note the column labels 1n, 1w, etc.).
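The derivation of the secondary core column can be sketched as follows (a purely illustrative Python fragment; the pointer values are invented rather than copied from Fig. 14.9, and indexes are 0-based):

# A hypothetical primary core column (core field = P#): cell i holds the pointer
# triple (n, w, c), i.e. the PNAME, WEIGHT, and CC# pointers, for the record whose
# part number occupies cell i of the Field Values Table.
primary_core = [(4, 0, 0), (1, 4, 7), (6, 5, 1), (8, 2, 6),
                (0, 1, 4), (7, 3, 5), (2, 6, 2), (5, 7, 3), (3, 8, 8)]

# The (non-core) WEIGHT column of the star table holds the inward spokes: the
# cell for the k-th smallest weight points back to the core cell of its record.
# It is simply the inverse of the w pointers in the primary core column.
weight_inward = [0] * len(primary_core)
for core_cell, (_, w, _) in enumerate(primary_core):
    weight_inward[w] = core_cell

# Secondary core column: the primary core column rearranged into WEIGHT order.
# The w pointer is dropped, because in the rearranged column it is always just k.
secondary_core = [(primary_core[c][0], primary_core[c][2]) for c in weight_inward]

print(weight_inward)    # [0, 4, 3, 5, 1, 2, 6, 7, 8] -- the ORDER BY WEIGHT permutation
print(secondary_core)   # the (n, c) pointer pairs, stored redundantly in WEIGHT order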


Fig. 14.10: Star table with primary (P#) and secondary (WEIGHT) cores

Now suppose we want to reconstruct the part record passing through some particular cell in the WEIGHT column of the star table of Fig. 14.10—let’s say (arbitrarily) the fourth cell, which is still cell [4,3] according to the column labeling shown in the figure. The sequence of events is as follows. • Go to cell [4,3] of the Field Values Table of Fig. 14.1 and extract the value stored there (weight 15.0).


• Cell [4,3] of the star table contains the row number 8, so the part number of the record we want is in cell [8,1] of the Field Values Table. Go and extract it (part number P8). • Within the secondary core column, go to the PNAME cell corresponding to WEIGHT cell [4,3]. That PNAME cell is cell [4,3n], and it contains the row number 9. Go to cell [9,2] of the Field Values Table and extract the name (Wheel). • Within the secondary core column, go to the CC# cell corresponding to WEIGHT cell [4,3]. That CC# cell is cell [4,3c], and it also contains the row number 9. Go to cell [9,4] of the Field Values Table and extract the CC# value (cc5). Record reconstruction is now complete. Here’s the storage arithmetic again. Suppose once again that the file has M fields. Then: a) A zigzag table will have M pointers per record. b) A star table with a single core column will have 2(M-1) pointers per record. c) A star table with a primary core column and N secondary core columns will have 2(M-1) + N(M-2) pointers per record. Of course, Case b. is just that special case of Case c. in which N is zero.
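As a quick check of that arithmetic, here is a trivial Python fragment (the function name is mine, not the book’s) evaluating the formulas just given for the four-field parts file:

def pointers_per_record(m, n_secondary=0):
    # M pointers for a zigzag; 2(M-1) + N(M-2) for a star with N secondary cores.
    return m, 2 * (m - 1) + n_secondary * (m - 2)

print(pointers_per_record(4))      # (4, 6) -- zigzag vs. star with a single core column
print(pointers_per_record(4, 1))   # (4, 8) -- zigzag vs. star with one secondary core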

Endnotes

1. In fact the core field is often referred to as a characteristic field. I’ll stay with the term core field in this chapter.

2. As in Chapter 10, “we” here really means the DBMS.

3. Chambers Twentieth Century Dictionary defines digitize to mean “to put (data) into digital form for use in a digital computer.”

4. It might be possible to automate that choice, but probably not if we introduce redundancy (which I’m about to do); in that case, human decisions are probably going to be needed.


Part IV: Conclusion


15 The Future Looks Bright Ahead

15.1 Introduction

The goal of this book has been to present a tutorial on the TransRelational Model (the TR model, also referred to herein as TR technology, or just TR for short). As explained in the preface, the TR model represents one specific but important application of a more general technology called the Tarin Transform Method; that more general technology is suitable for building data management systems of many different kinds, but I’ve deliberately concentrated in this book on its suitability for implementing the relational model in particular. Now, in this final chapter, I want to summarize and analyze the main points from what’s gone before—especially with respect to the benefits that this exciting new technology can provide—and I also want to speculate a little as to what might lie ahead.

15.2 The TR Model Summarized

Everything I’ve said about the TR model in this book so far has been based on Required Technologies documentation (the Initial Patent [63] in particular). However, I need to make it clear that I’ve altered most of the terms, I’ve simplified many of the concepts (and even omitted a few), and I’ve imposed my own sequence on the material—always in the hope of making what I think are the really important ideas more readily understandable. Also, the strict stratification into three layers of abstraction described in Chapter 3 (and adhered to throughout the present book) isn’t explicitly called out in the Required Technologies documents, and the same is true for some of the techniques sketched in Chapter 10 and elsewhere for implementing the relational operators.

Data Independence

TR is a transform technology. The notion of a transform (as that term is used in the TR context) is a logical consequence of the familiar notion of data independence: It should be possible to change the way the data is physically stored without having to change the way the data looks to the user. This objective clearly implies the need for at least one transform—more generally, for a set of N transforms for some N greater than zero—to be performed between the external and internal levels of the system. The trouble is, today’s direct-image systems provide only a very weak form of data independence (I’m tempted to say they implement the identity transform). As a consequence, we’ve come to think of data independence as little more than just shielding the user from the bits and bytes on the disk. But there’s so much more to it than that! We need to get away from those direct-image transforms. And that’s what TR is all about; TR is, I think, unique in the emphasis it places on the crucial concept of data independence. Every part of the TR model as described in earlier chapters fits naturally within, and contributes to, this overall perspective on the database implementation problem.


Let me illustrate the foregoing by briefly reviewing the material from those earlier chapters. I began in Chapter 3 by distinguishing three levels of abstraction—the user level, which is relational; the TR level, which is based on TR tables (principally the Field Values Table and the Record Reconstruction Table); and the file level, which is a level of indirection between the other two. (Thus, we’re already talking about at least two transforms, one between relations and files, and one between files and TR tables.) Relations have tuples and attributes; files have records and fields; tables have rows and columns. I stressed the point that TR tables are definitely not the same thing as SQL tables, and I explained at some length why I thought it was better to use relational terminology, not SQL terminology, at the user level. Indeed, I think it’s fair to say that one problem with SQL—one of many, unfortunately—is precisely that it muddies the distinction between the relational and file levels; in some ways, in fact, SQL tends to focus on the file level more than it does on the relational level.

Be that as it may, the crucial insight underlying the TR model is this. Let r be some record at the file level. Then: The stored form of r involves two logically distinct pieces, a set of field values and a set of “linkage” information that ties those field values together, and there’s a wide range of possibilities for physically storing each piece. In direct-image systems, the two pieces are kept together, and the linkage information is represented by physical contiguity. In TR, by contrast, the two pieces are kept separate; the field values are kept in the Field Values Table, and the linkage information is kept in the Record Reconstruction Table. That separation (which represents a major logical transform right away, of course) effectively allows the very same stored data to be kept sorted in many different ways at the same time. It’s rather like having many different pointer chains running through the same set of stored data at the same time; however, the big difference is that, in TR, (a) those pointer chains are separate from the stored data as such, and (b) they effectively connect fields, not records (contrast pointer chains as found in CODASYL systems, as described in Chapter 2). Also, we saw in Chapter 14 that those “chains” of pointers aren’t necessarily chains anyway, in TR.

Memory Implementation

In Chapters 4-10, I presented the basic ideas of the TR model while ignoring (for the most part) the special problems of implementing that model on disk, and I’ll follow the same pattern in this brief review. First of all, then, let’s go over the way the Field Values Table and Record Reconstruction Table might be implemented in memory (for full details, see Chapter 4). In its simplest form, the Field Values Table has a column for each field in the corresponding file, and the entries in a given column consist of the values of the corresponding field arranged into sorted order. Let c_fv and c_rr denote cell [i,j] of the Field Values Table and cell [i,j] of the Record Reconstruction Table, respectively. Let r be that record of the file whose jth field value appears in c_fv, and let the (j+1)st field value of r appear in cell [i',j+1] of the Field Values Table. Then c_rr contains i'. Thus, the Record Reconstruction Table allows any or all of the records in the file to be reconstructed—by means of the zigzag algorithm—from the Field Values Table.
Moreover, entering the Record Reconstruction Table on any particular column j and reconstructing the record corresponding to cell [1,j], then the record corresponding to cell [2,j], and so on, will eventually reconstruct a version of the file whose records are ordered by values of field j.
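In Python terms, the zigzag reconstruction algorithm amounts to something like the following sketch (illustrative only; the file, the tables, and the 0-based indexing are all invented for the purpose, and a real implementation would of course look quite different):

# Illustrative only: a three-field file (P#, PNAME, WEIGHT), its Field Values
# Table, and a matching Record Reconstruction Table, all with 0-based indexes.
# Both tables are stored here column by column: table[j][i] is cell [i,j].
fvt = [
    ["P1", "P2", "P3"],          # P# column, sorted
    ["Bolt", "Bolt", "Nut"],     # PNAME column, sorted
    [12.0, 14.0, 17.0],          # WEIGHT column, sorted
]
rrt = [
    [2, 0, 1],   # cell [i,0] points to the row of the record's PNAME value
    [0, 2, 1],   # cell [i,1] points to the row of the record's WEIGHT value
    [1, 0, 2],   # cell [i,2] points (wrapping around) to the row of its P# value
]
M = len(fvt)

def reconstruct(row, col=0):
    # The zigzag algorithm: rebuild the record passing through cell [row, col].
    fields = {}
    for _ in range(M):
        fields[col] = fvt[col][row]                  # pick up this field's value
        row, col = rrt[col][row], (col + 1) % M      # follow the zigzag pointer
    return tuple(fields[j] for j in range(M))        # present fields in file order

print(reconstruct(0, 0))                             # ('P1', 'Nut', 14.0)
print([reconstruct(i, 1) for i in range(len(fvt[1]))])   # the file, ordered by PNAME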


Let me remind you that the Field Values Table and the Record Reconstruction Table both start out being isomorphic to the corresponding file—that is, they both have the same number of rows and columns as that file has records and fields, respectively. What’s more, the Record Reconstruction Table stays isomorphic in this sense; however, the Field Values Table ceases to do so when the condensed- and merged-column transforms are introduced (see below). Let me also remind you that the Field Values Table is the only TR-level construct that contains user data as such; all the rest—the Record Reconstruction Table, also the Permutation Table and others—contain implementation information (mostly pointers). Finally, let me remind you that those pointers can usefully be thought of as surrogates for the corresponding field values.

In Chapter 5, I briefly discussed what’s involved in inserting new records and in retrieving, deleting, or updating the records that “pass through” some given cell of the Record Reconstruction Table. I explained how DELETE didn’t physically remove information from the database but merely flagged it as “logically deleted,” and how subsequent INSERTs could then reuse such logically deleted items. I pointed out that finding records and retrieving them (or deleting or updating them) were logically distinct processes, and I explained how TR took advantage of that fact. And I explained how all of these operations effectively took place at the field level rather than the record level. I also discussed “symmetric exploitation” and the possibility of corresponding symmetry of performance (but that’s an issue I want to come back to in the next section).

In Chapter 6, I discussed update operations in more depth. In particular, I explained the swap algorithm for implementing INSERT operations; I also sketched an alternative approach based on the use of a separate overflow structure, and pointed out that such an approach enjoyed many advantages (not only in the area of performance but also, and importantly, in the area of backup and recovery). Note: Yet again we’re talking about some important transforms. I won’t keep on saying this.

Next, recall that the Record Reconstruction Table corresponding to a given file (and given Field Values Table) isn’t unique, in general, owing to the fact that most fields in most files involve duplicate values. In Chapter 7, I showed how we can take advantage of this fact; to be specific, I showed how certain “preferred” Record Reconstruction Tables could be used to provide several major-to-minor orderings simultaneously (as well as, a fortiori, several individual field orderings simultaneously). And I also showed in that chapter (as well as in Chapters 5 and 7) how to use the Permutation and Inverse Permutation Tables as a basis for building any desired Record Reconstruction Table, “preferred” or otherwise. “Preferred” Record Reconstruction Tables constitute one of several important refinements to the basic TR model; although those refinements might be thought of as frills, in a sense, they’re so important and useful that (it seems to me) they’re virtually certain to be supported in any real implementation.

In Chapter 8, I took a look at another important refinement, condensed columns.
The idea here is that columns in the Field Values Table can be “condensed” by removing redundant duplicate field values (keeping instead, with each individual field value that remains, a row range indicating which rows would have contained that value in the corresponding uncondensed version of the Field Values Table). As well as representing a possibly dramatic saving in storage space (see the subsection entitled “The Copernican Analogy” in the next section), condensed columns make update operations (and retrieval operations too, quite probably) much more efficient. I remind you that a condensed column can usefully be thought of as a histogram.
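By way of illustration, a condensed column might be represented along the following lines (a Python sketch with invented values and 0-based row numbers, not the representation used in the Required Technologies documents):

from bisect import bisect_right

# A hypothetical condensed WEIGHT column: each distinct value is kept once,
# together with the half-open range of rows it would occupy in the uncondensed
# column.  In effect, the structure is a histogram of the column.
condensed = [
    (12.0, (0, 3)),    # rows 0..2
    (14.0, (3, 4)),    # row 3
    (17.0, (4, 9)),    # rows 4..8
]
starts = [lo for _, (lo, _) in condensed]

def value_at(row):
    # The field value that row `row` of the uncondensed column would contain.
    return condensed[bisect_right(starts, row) - 1][0]

def rows_for(value):
    # The row range covered by `value` (empty if the value is absent).
    for v, (lo, hi) in condensed:
        if v == value:
            return range(lo, hi)
    return range(0)

print(value_at(5))          # 17.0
print(list(rows_for(12.0))) # [0, 1, 2]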


Of course, condensing the Field Values Table in the foregoing sense does make file and record reconstruction a little more complicated, and possibly a little less efficient. We can fix this problem by expanding the Record Reconstruction Table, such that each cell now includes two pointers (that is, two row numbers) instead of one. One pointer is the same as before—it identifies the appropriate “next” cell in the Record Reconstruction Table—while the other is a direct pointer to the cell of the Field Values Table that contains the corresponding field value. In Chapter 9, I discussed yet another important refinement, merged columns. The idea here is that distinct fields at the file level might map to the same column in the Field Values Table, eliminating further redundancy, and in particular making joins more efficient. What’s more, the distinct fields in question don’t have to come from the same file—the only requirement is that they must be of the same data type; thus, there’s no longer necessarily a one-to-one correspondence between files at the file level and Field Values Tables at the TR level (note the implications here for candidate and foreign keys in particular). In the extreme case, in fact, there could be just a single Field Values Table for the entire database. In relational terms, such an implementation would effectively mean that we were storing attributes instead of tuples (and those stored attributes would never contain any duplicate values). Note: In characterizing such an implementation in such a manner, I’m tacitly regarding (for example) attribute S# in the suppliers relation S and attribute S# in the shipments relation SPJ as “the same” attribute. This interpretation is consistent with the formal definition of the term attribute as found in, for example, reference [40]. What’s more, since it’s even possible for that single Field Values Table to include “logically deleted” values that don’t currently appear in any user relation at all, such an implementation might reasonably be characterized as one—or as approaching one—that stores domains (= types), not just attributes.
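A merged column can be pictured with a sketch like this one (Python, with invented data and names; the point is only that the two files end up holding pointers into one shared, duplicate-free column):

# Supplier numbers from a suppliers file and from a shipments file, of the same
# type, kept once in a single shared sorted column.
suppliers_s = ["S1", "S2", "S3"]                     # S# field of suppliers
shipments_s = ["S1", "S1", "S2", "S2", "S2", "S3"]   # S# field of shipments

merged = sorted(set(suppliers_s) | set(shipments_s)) # one copy of each value
index_of = {v: i for i, v in enumerate(merged)}

# Each file's records now carry pointers (surrogates) into the shared column
# instead of their own private copies of the values.
supplier_ptrs = [index_of[v] for v in suppliers_s]
shipment_ptrs = [index_of[v] for v in shipments_s]

# A join on S# reduces to matching pointers: equal surrogates, equal values.
print(supplier_ptrs, shipment_ptrs)   # [0, 1, 2] [0, 0, 1, 1, 1, 2]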


Next, in Chapter 10, I indicated what was involved in using TR to implement the relational operators, and gave evidence to support the claim that those implementations should be especially efficient. Joins in particular involve linear costs instead of multiplicative ones; in my opinion, this fact by itself—even if it was the only advantage provided by TR—would still be more than sufficient to place TR head and shoulders above its competitors. Note in particular that it implies that joins are scalable;1 as a consequence, if some query fundamentally requires N joins, then it’s all right to go ahead and request those N joins, regardless of the value of N. By contrast, it’s well known that direct-image implementations are effectively incapable of handling values of N that are greater than some fairly small limit (perhaps seven or eight). Yet a properly designed database could easily have several hundred relations, and realistic queries could easily involve a 20- or 30-way join.

Disk Implementation

In Chapters 11-14, I turned my attention to the question of implementing the TR model on disk. In Chapter 11, I explained the basic problem: We need to do everything we can to minimize disk seeks. More specifically, we want as much of the database as possible to be memory-resident at run time, and we want a good data representation on disk to reduce the amount of seeking we have to do when we do have to do it. The overall objective is to try and get “main-memory performance off the disk.” Chapter 11 also described some of the logical and physical compression techniques that TR uses to address the foregoing problems. Note in particular that (at least to a first approximation) those techniques have the effect of ensuring that the Field Values Table will always be memory-resident. But the Record Reconstruction Table has the potential to be much larger than the Field Values Table and therefore still presents a problem. Chapter 11 included an overview of certain TR-specific approaches to that problem, while the next three chapters described three of those TR-specific solutions in more detail. Chapter 12 explained the use of file factoring to reduce the problem to one of dealing effectively with “large files.” Chapters 13 and 14 then addressed this latter problem in detail; Chapter 13 discussed the use of file banding, and Chapter 14 examined the possibility of using stars instead of zigzags in the Record Reconstruction Table. Chapters 13 and 14 also raised the possibility of judicious use of controlled redundancy.

Let me conclude this brief summary by reminding you that, despite its comparatively low-level nature, TR is still an abstract model, and is accordingly capable of many different physical implementations. Several physical implementation alternatives were touched on at various points in previous chapters.

15.3 Analysis

Clearly, TR differs radically from conventional direct-image approaches to implementation; to say it one more time, it’s a transform technology, not a direct-image one. In this section, I want to describe in outline a variety of ways in which TR’s transform technology might reasonably be characterized. The ways in question are all ones that I think can help explain the fundamental significance of the transform idea and can provide some insight, at least by analogy, into what TR is really all about.


The Logarithm Analogy

The first analogy is with logarithms (I mentioned this one briefly in Chapter 1). The idea is that, in a sense, TR’s transform technology does for database processing what logarithms do for numeric processing. As I put it in Chapter 1:

[Logarithms] allow what would otherwise be complicated, tedious, and time-consuming numeric problems to be solved by transforming them into vastly simpler but (in a sense) equivalent problems and solving those simpler problems instead ... [and] TR does the same kind of thing for data management problems.

—from Chapter 1

Let’s think about logarithms for a moment. We all know the pragmatic difficulties involved in carrying out typical arithmetic operations on large numbers:

There is nothing more troublesome in mathematics than the multiplications, divisions, square and cubic root extractions of great numbers, which involve a tedious expenditure of time, as well as being subject to “slippery errors.”

(These remarks are due to John Napier, the inventor of logarithms. The quote is from Jan Gullberg’s book Mathematics: From the Birth of Numbers, W. W. Norton and Company, 1997.) Before Napier came along, such “multiplications, divisions, [and] ... root extractions” were, at best, hugely labor-intensive and time-consuming; at worst, they couldn’t be done at all, because the amount of time required was prohibitive. (Does this sound familiar?)

Logarithms solved this problem. As already noted in the quote from Chapter 1, they did so by means of certain transforms: They allowed the objects of interest (numbers) to be transformed into a new—and incidentally unfamiliar—representation; that transform then allowed the operators of interest (multiply, divide, etc.) to be transformed into other, more familiar and much simpler, operators (add, subtract, etc.). For example, suppose we need to multiply two large numbers x and y. Then we proceed as follows:

1. First, we transform the numbers x and y into their logarithms x' and y', say. These transforms are done by looking the logarithms up in a precomputed table.

2. Next, we transform the operation of multiplying the two numbers into the much easier one of adding their logarithms x' and y', thereby obtaining a result z' say.

3. Finally, we transform z' into the desired result z by looking up the antilogarithm of z' in another precomputed table.

Not only do the foregoing transforms make the problem much easier to solve, they also drastically reduce the amount of time involved—from multiplicative time to additive or linear time, in fact. (Again, does this sound familiar?)
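For instance, here is the whole procedure in a few lines of Python, with the lookup tables replaced by library calls and two operands invented for the example:

import math

x, y = 31_459_265.0, 27_182_818.0             # two largish numbers, invented here

x_log, y_log = math.log10(x), math.log10(y)   # step 1: transform the operands
z_log = x_log + y_log                         # step 2: multiplication becomes addition
z = 10 ** z_log                               # step 3: transform back (antilogarithm)

print(z, x * y)   # the two results agree, up to floating-point rounding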


Now let’s get back to TR. TR also transforms the objects of interest—in this case, data files—into a new and unfamiliar representation, the Field Values and Record Reconstruction Tables. And then it transforms the operators of interest (value lookups and sequential searches) into more familiar and much more efficient operators, such as binary search, on those tables. The net effect, as with logarithms, is that: • Problems that were difficult and excessively time-consuming with the traditional approach become easy and fast with the new approach. • Problems that were effectively intractable with the traditional approach become feasible with the new approach. • More generally, problems that required multiplicative time with the traditional approach require only linear time with the new approach, and problems that required linear time with the traditional approach require only logarithmic time with the new approach. There’s one more point to be made. With both logarithms and TR technology, all of the “heavy lifting” is done just once, in advance. In the case of logarithms, the lookup tables are precomputed; that is, the work of computing the actual logarithms and antilogarithms is done once, ahead of time, instead of being repeated over and over again every time we want to do some numeric calculation. In the same kind of way, with TR, the Field Values and Record Reconstruction Tables are also precomputed (at load time, in fact); that is, all of the data sorting and merging is done ahead of time, instead of over and over again every time we want to access the database (when executing some query, for example). And in both cases, doing the “heavy lifting” just once in advance translates into overwhelming cost benefits.2


The Copernican Analogy

There’s another analogy that I think is helpful, too, and that’s with the Copernican revolution—that is, the conceptual shift from the view in which the sun (and everything else) revolved around the earth to one in which the earth revolved around the sun instead.3 As we all know, the perception that the sun revolves around the earth makes a kind of intuitive sense (and might even be defended, to some extent, on relativistic grounds), but it’s certainly misleading if you want to understand the bigger picture. Well, in the same kind of way, the perception that relations consist primarily of tuples, and that those tuples then only secondarily contain individual data values, also makes sense—indeed, it’s logically correct—but, again, it can be misleading if you want to understand the bigger picture.

Part of the problem here lies with books like this one. When such books show relations in pictorial form (that is, as SQL-style tables), for obvious reasons they always use examples that involve very few tuples (see any of the examples in the present book). As a consequence, the pictures in question always look like nice neat little rectangles, and the tuples and the attributes “carry equal weight,” as it were. But relations in real databases aren’t like that—at least, not usually; more usually, such relations involve comparatively few attributes but several millions or even billions of tuples, and the true picture becomes very long and skinny, almost more like a long thin piece of string than a “nice neat little rectangle.” (Even with nice neat little rectangles, in fact, it’s often psychologically easier to read down the columns rather than across the rows, a state of affairs that I think lends weight to the present argument.)

If we think of relations in this way, it becomes clear that it’s the attributes, not the tuples, that are the real implementation problem; for example, we need to worry much more about how to search down the attributes than we do about how to search across the tuples. In other words, we need to make a conceptual shift from a tuple-oriented to an attribute-oriented point of view. Making that shift is, in a way, what the TR approach does: First, we break the records up into their constituent fields and sort the data by each field individually (of course, now I’m talking about the file analog of the relation in question), and only later do we worry about connecting the field values back together again to form the corresponding records. As we know, this approach is the exact opposite of the traditional direct-image approach, in which the records aren’t broken up at all but are kept connected by physical contiguity. Thus, in the direct-image approach, the records are necessarily kept sorted in just one sort order, and redundant auxiliary structures then have to be introduced in order to obtain the effect of sorting the fields individually. As we also know, it’s this shift in perspective that allows us to introduce additional important techniques such as condensed and merged columns. (In this connection, I’d like to remind you in particular of the huge amount of data compression that those techniques make possible—recall the example from Chapter 8 of a relation representing drivers’ licenses, where we had 20 million tuples but only ten different hair colors, perhaps.)


In a nutshell, the shift from a tuple- to an attribute-oriented point of view, like the shift afforded by the Copernican revolution, shows how things “really” fit together behind the scenes: In both cases, it’s the “right” way to think about the problem, and it’s the key to the “right” solution. What’s more, the shift has surprisingly deep and powerful implications in both cases, implications that go far beyond the initial simple recognition of the shift as such to a truly fundamental conceptual transformation underneath the surface. In the case of TR in particular, that conceptual transformation seems to me to be the breakthrough that’s needed in order to “do relational databases right”; instead of making comparatively small and incremental improvements, which is what database administrators, DBMS implementers, and database researchers have been doing for years, we can take a totally fresh approach to the problem, one that (as we’ve seen) provides huge performance—and other—benefits.

TR vs. Indexing

Now I want to say more about those redundant auxiliary structures; in particular, I want to say more about indexes. We’ve seen that TR does away with the need for indexes. Or does it? In what follows, I’d like to examine this question from a slightly different point of view. Consider Fig. 15.1 (essentially a repeat of Fig. 4.1 from Chapter 4), which shows a possible file for suppliers, and Fig. 15.2, which shows a corresponding Permutation Table. Just to remind you, column S# in this latter table contains “the S# permutation”—that is, it shows that sorting the file of Fig. 15.1 by ascending supplier number returns the records in the sequence 4, 3, 5, 1, 2—and similarly for the other columns.

Fig. 15.1: A suppliers file

Fig. 15.2: A Permutation Table corresponding to the file of Fig. 15.1
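In modern terms, each column of the Permutation Table is, in effect, an argsort of the corresponding field. A tiny Python sketch (invented data and 0-based row numbers, not the suppliers file of Fig. 15.1) makes the point:

status = [20, 10, 30, 20, 30]    # a STATUS field of a small, invented suppliers file

# The STATUS permutation: row numbers of the file, sorted by the value they carry.
status_perm = sorted(range(len(status)), key=lambda row: status[row])
print(status_perm)                           # [1, 0, 3, 2, 4]

# Used exactly as an index would be: visit records in STATUS order, or
# binary-search the sorted values to locate the rows for a given STATUS.
print([status[row] for row in status_perm])  # [10, 20, 20, 30, 30]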


Observe now that the S# permutation is an index, in a sense!—at least, it does provide the functionality of a conventional index.4 And, of course, analogous remarks apply to the other permutations, too. Given that the permutation notion plays such a crucial role in TR, therefore, we might say, not that TR dispenses with indexes, but rather that indexes are essential. In fact, we might quite reasonably say that the TR internal structures—the Field Values Table and the Record Reconstruction Table—are obtained by building indexes on everything, connecting all of those indexes together (but storing the field values and the linkage information separately), and then throwing away the indexed file.

Of course, it’s reasonable to talk in the way I’ve just been doing only if we have a very clear idea of what we really mean. Certainly TR does dispense with indexes as conventionally understood (and so it also dispenses with all of those undesirable consequences of such indexes as described in Chapter 2). After all, TR clearly does away with the notion of the stored file as a direct image of a user-level relation; it therefore also a fortiori does away with the notion of there being a distinction between such a file, on the one hand, and indexes over such a file, on the other. Thus, in the very act of doing away with the direct-image file, TR also does away with the idea of an index that points into such a file, which includes most or all of indexing as conventionally understood—and so I stand by my claim that TR abolishes the need for indexing in the conventional sense. Yet this abolition of indexing in the conventional sense is effectively accomplished by absorbing the functionality of such indexing into TR’s own internal structures.


I’d like to expand a little on the foregoing. Conventional DBMSs involve a whole host of extremely difficult performance questions, some of which have to be answered by the database administrator (for example, “Which indexes should I build?”) and some by the system optimizer (for example, “Which indexes should I use?”). And how those questions are answered typically has huge implications for system performance—meaning there are huge penalties to pay if the answers are wrong. Now, TR doesn’t do away with such questions altogether, but it certainly does do away with many of them. And those questions that remain tend to be much easier to answer, and to have far less drastic performance implications, than their counterparts in conventional systems. In many cases, in fact, the implementation can probably answer the question for itself, or at least provide some sensible default answer; for example, user-level attributes of the same type might automatically cause a corresponding merged column to be built in the Field Values Table at the TR level. Automating decisions in this manner can obviously help to reduce the load on the database administrator still further. However, there will doubtless always be a need for some kind of “manual override” in certain situations.5

TR and Hyperplanes

The final characterization of TR that I want to discuss here is one you might find appealing if you happen to be mathematically inclined. Recall these remarks from Chapter 2:

[A] relation can ... be pictured as a table. However, a relation is not a table. A picture of a thing isn’t the same as the thing! In fact, the difference between a thing and a picture of that thing is another of the great logical differences ...

—from Chapter 2

Although these remarks are undoubtedly true, it’s also true that it can often be very convenient, informally, to think of a relation as a table. Tables are “user-friendly”; the fact that we can often think of relations, informally, as tables—sometimes more explicitly as “flat” or “two-dimensional” tables—makes relational systems intuitively easy to understand and use, and makes it intuitively easy to reason about the way such systems behave. Indeed, it’s a very nice property of the relational model that its basic data structure, the relation, has such an intuitively attractive pictorial representation. Unfortunately, many people have let themselves be blinded by that attractive pictorial representation into thinking that relations as such are “flat” or “two-dimensional.” Perhaps even more unfortunately, this criticism has historically applied to DBMS implementers in particular—a fact that presumably accounts for the conventional direct-image approach to implementation found in most SQL systems on the market today. Indeed, we might quite reasonably characterize those direct-image implementations as “flat” or “two-dimensional,” and we already know from Chapter 2 the problems that such implementations lead to. But, in general, relations simply aren’t two-dimensional. Rather, if a given relation has N attributes, then each tuple in that relation represents a point in a certain N-dimensional space—and the relation as a whole represents a set of such points. In other words, relations are N-dimensional, not two-dimensional! As I’ve written elsewhere (in quite a few places, in fact): Let’s all vow never to say “flat relations” ever again.


Let’s agree to refer to the points in a given N‑dimensional space as “N‑points,” for brevity. Then the overall database can be regarded as a collection of such N‑points. Of course, N will have different values for different points in the database, in general; and even when two points do have the same value for N, the points in question might be based on different dimensions. For example, the suppliers relation S and the shipments relation SPJ both contain tuples representing, specifically, 4‑points; however, the underlying dimensions are S#, NAME, INTEGER, and CHAR in the case of suppliers, and S#, P#, J#, and INTEGER in the case of shipments. Now let’s focus for a moment on just one of those 4‑points: let’s say the 4‑point representing the shipment for supplier S1, part P1, and project J1, with quantity 200. Consider some particular attribute value within that shipment tuple, say the supplier number S1. The TR representation of that attribute value involves a cell in the Field Values Table, and of course that cell is directly linked, via an appropriate zigzag or star, to the TR representations of all other attribute values from the same shipment tuple. What’s more—thanks to the condensed-columns technique described in Chapter 8—it’s also directly linked, via other zigzags or stars, to the TR representations of all other attribute values in all other shipment tuples with the same supplier number. In other words, all shipment 4‑points with “the same S# coordinate” (if I might be allowed to talk in such terms) are directly linked together at the TR level. And, of course, the same is true for all shipment 4‑points with the same P# coordinate, or the same J# coordinate, or the same QTY coordinate. In this sense, the TR representation of any given relation can reasonably be regarded as being directly N‑dimensional: All “points” (that is, all tuples) in that relation that belong to the same “hyperplane” (see the next paragraph but one) are directly connected together at the TR level. By contrast, conventional direct-image implementations—precisely because they are direct-image and thus very close to the picture the user sees—can be regarded as being two-dimensional; to be specific, distinct points from the same hyperplane in such an implementation are represented independently of one another, and the connections among them therefore have to be explicitly represented by independent auxiliary structures such as indexes. There’s more. Thanks to the merged-columns technique described in Chapter 9, all shipment 4‑points with a given S# coordinate can also be directly linked at the TR level to the (unique) supplier 4‑point with the same S# coordinate. In fact, if we take the merged-columns idea to its logical conclusion, in which there’s just one Field Values Table for the entire database, then we can say that whenever two tuples are logically connected at the relational level (because they have some attribute value in common), then their internal representations are directly linked at the TR level. In such a situation, TR can be regarded as providing a directly N‑dimensional representation of the entire database. And, of course, it’s that N‑dimensional representation that (among other things) allows joins to be done in linear time, as we’ve already seen. 
It’s also what allows both of the following tasks to be carried out efficiently: (a) Given a particular tuple, find all of its attribute values; (b) given a particular attribute value, find all of the tuples that contain it (see Chapter 11, Section 11.2). Note: In case you’re not familiar with the concept, let me explain what I mean by the term “hyperplane.” In ordinary three-dimensional space, where points are identified by three coordinates x, y, and z, the set of all points with the same x-coordinate forms a plane (and likewise for the set of all points with the same y-coordinate or the same z-coordinate). More generally, in any given N‑dimensional space, the set of all points with some given coordinate in common forms a hyperplane. Thus, to say that two N‑points belong to the same hyperplane is just a fancy way of saying they have some common coordinate. Observe that any given N‑point can be regarded as the intersection of N such hyperplanes (and the database as a whole can thus be thought of as a collection of intersections of hyperplanes).


15.4 A Review of the Benefits

In this section, I want to try and bring together in one place a summary of all of the many benefits I believe TR can provide. Some of those benefits have been discussed previously, others are new. Note: I should explain right away—as I’ve done elsewhere, in a somewhat similar context [34]—that the points that follow are all very much interwoven; sometimes they’re even the same point in different guises. It’s always hard to structure this kind of material completely orthogonally. Be that as it may, I’d like to begin by quoting some extracts from reference [63] and offering some comments on those extracts. The first is, in part, a repeat of some text I quoted in Chapter 1:

The present invention provides a new and efficient way of structuring databases [that supports] efficient query and update processing, [reduces] database storage requirements, and [simplifies] database organization and maintenance. Rather than [achieving] orderedness through increasing redundancy (that is, superimposing an ordered data representation on top of the original unordered representation of the same data), the present invention achieves orderedness through eliminating redundancy on a fundamental level.

—from the Initial Patent
Comment: If you’ve managed to read the book this far, you should be in a position to understand exactly what’s being claimed here and—I hope—agree with it. —◆◆◆◆◆—


[Conventional implementation approaches] contain key structural weaknesses, including high levels of unorderedness and redundancy, that have traditionally been regarded as unavoidable. For example, [data in such implementations] can be sorted ... on at most one criterion ... This limitation renders essential database functions such as querying ... on all criteria other than this privileged one ... awkward and overly resourceintensive ...[It] obscures natural and exploitable latent data relationships that are revealed by more ordered, condensed, and efficient data arrangements [and] leads to negative characteristics of state-of-the-art DBMSs such as unorderedness, redundancy, cumbersomeness, algorithmic inefficiencies, and performance instabilities. —from the Initial Patent Comment: The “key structural weakness” of the first sentence here is, of course, the conventional direct-image style of implementation, in which user-level tuples map more or less directly to physically stored records (what I called in the previous section a “flat” or “two-dimensional” representation). As the quoted extract suggests, that direct-image style has simply been taken as a given in most prior work. The breakthrough represented by the TR approach implies that numerous traditional assumptions underlying prior investigations into physical implementation are no longer valid. The “more ordered, condensed, and efficient data arrangements” that TR technology makes possible are, of course, the condensed and possibly merged Field Values Tables and the associated Record Reconstruction Tables. Let me also offer a few comments on those “negative characteristics of state-of-the-art DBMSs”: • Unorderedness: This one’s obvious—the (unique) physical ordering of a conventional stored file clearly reflects at most one sensible logical ordering, and possibly none at all. • Redundancy: There are at least two points here. First, the auxiliary structures (typically indexes) that are introduced to address the problem of unorderedness involve redundancy by definition. Second, the fact that column condensing and merging can’t be used in a direct-image implementation means that the very same individual field values are typically repeated many times (possibly very many times) in storage, both within and across distinct stored files. • Cumbersomeness: The vast array of auxiliary structures supported in conventional DBMSs—all of which are ad hoc to a degree—can certainly lead to cumbersome representations, representations that are difficult to design in the first place and can be difficult to change later, too. What’s more, the DBMS code itself, which has to deal with all of these different representations, can be cumbersome and difficult to manage as well. • Algorithmic inefficiencies: By way of example here, consider what’s involved in implementing joins or aggregations in a TR system vs. what’s involved in doing the same thing in a conventional system (see Chapter 10). The TR implementations are clearly much more efficient.


• Performance instabilities: And by way of example here, consider the difference in a conventional DBMS between doing a restriction operation when a suitable index exists vs. doing the same thing when it doesn’t. Or consider the difference, again in a conventional DBMS, between doing a join when the stored versions of the relations involved are suitably sorted ahead of time vs. doing the same thing when they aren’t. Again, the TR implementations are clearly much more efficient, and questions such as “Does a suitable index exist?” and “Is the data suitably sorted?” simply don’t arise. —◆◆◆◆◆— These supplementary structures are inherently, and often massively, redundant ... [and] typically grow to be overly lengthy, convoluted, and ... cumbersome to maintain, optimize, and especially update. —from the Initial Patent Comment: The “supplementary structures” mentioned here are, of course, the auxiliary structures, typically indexes, introduced as noted previously to overcome the problem of “unorderedness” in conventional DBMSs. “Cumbersome to update”: As we saw in Chapter 2, indexes might perhaps speed up queries, but they certainly slow down updates—partly because of the “inherent redundancies” also mentioned in the quote. Updates are faster in TR in part because there simply aren’t any auxiliary structures to update. —◆◆◆◆◆— [Data in TR] is much more easily manipulated than in traditional databases, often requiring only that certain entries in the [Record Reconstruction Table] be changed, with no copying of data. —from the Initial Patent Comment: This extract refers primarily to the TR mechanism by which stored field values can be shared across stored records (see Chapters 8 and 9). Among other things, that mechanism allows the user to insert new tuples without new attribute values having to be physically inserted, and it allows the user to delete existing tuples without existing attribute values having to be physically deleted. In other words, “data manipulation” or update operations—meaning INSERT, DELETE, and UPDATE operations, as discussed in Chapter 6—can be very efficient, too. —◆◆◆◆◆— [Certain] operations such as [histogram] analysis, data compression, and [obtaining a variety of distinct] orderings, which are computationally intensive in [conventional DBMSs], are obtainable immediately from the structures described herein. The invention also provides improved processing in parallel computing environments. —from the Initial Patent

Comment: The first sentence here is self-explanatory. Regarding the second sentence, I did mention at the very end of Chapter 3 that the TR tables are suitable for implementation in a multiprocessor environment, if such is available. The details are beyond the scope of this book; however, reference [63] does include several suggestions as to how parallel processing algorithms might be used to improve TR performance—for example, searches on columns of the Field Values Table might well be parallelized, and the same is true for the sorts that are needed to build the Field Values Table in the first place.

—◆◆◆◆◆—

Now let’s revisit some of the problems that I claimed in Chapter 2 come with the use of indexes and other conventional auxiliary structures, and see in each case how TR overcomes those problems:

• DBMS implementation complexity: The complexity in question arises from the need for the DBMS to deal with many different auxiliary structures and associated access methods, and in particular from the consequent need for the optimizer to carry out the process of access path selection. The radical new TR internal structures (primarily the Field Values Table and Record Reconstruction Table) address this problem directly by eliminating unnecessary options at the physical level. For example, the optimizer doesn’t have to decide whether or not to use an index, because there aren’t any indexes, and that’s because TR doesn’t need any indexes (at least, not in the conventional sense—see the subsection entitled “TR vs. Indexing” in the previous section).

• Stored data redundancy: See the discussion of redundancy earlier in this section. Note: As explained in Chapters 13 and 14, controlled redundancy can have its uses. Of course, the kind of redundancy introduced by indexes and other auxiliary structures is controlled too—but it isn’t necessary.

• Additional storage space requirements: Even if we limit our attention to the raw data alone and ignore the additional storage space requirements of auxiliary structures, the TR representation needs far less storage space than conventional structures (an 80 percent reduction is not atypical). In other words, the TR representation—especially when columns are condensed and merged—can be thought of as a highly compressed representation. What’s more, the compressions in question have the effect of speeding up access as well as drastically reducing storage space, and the compression and decompression algorithms themselves are very fast.

• Physical database design complications: The fact that traditional DBMSs offer so many different auxiliary structures and access methods (see DBMS implementation complexity above) means that physical database design in such a system can be a very difficult task—especially since there are few solid guidelines for choosing between physical design alternatives. The TR structures directly address this problem, too, again by eliminating unnecessary physical design options.

• Reorganization and tuning: Following on from the previous point, traditional DBMSs typically require both (a) periodic physical database reorganization, and (b) constant tuning and retuning, in order to meet a variety of performance goals. The need for such reorganization and tuning is greatly reduced in TR—even eliminated altogether, in many cases. Note: In connection with the foregoing, it’s worth mentioning that Codd himself is on record as stating (in reference [8]) that one of his objectives in introducing the relational model in the first place was “to simplify the potentially formidable job of the database administrator.”6 And, while it might be argued that the database administrator’s job in today’s SQL systems is simpler than it was in pre-SQL systems, I don’t think anyone could reasonably claim that those SQL systems make that job easy. And it seems to me that the root cause of the problem is the direct-image style of implementation still found in those systems. The relevance of TR to Codd’s objective is obvious.

• Logical database design complications: As I said in Chapter 2, physical design considerations should in principle have no impact on logical design, but in practice they usually do (once again because of the direct-image style of implementation). As a particularly egregious example, how often have we been told that we must “denormalize for performance”? As I’ve written elsewhere [27], denormalization (or something akin to denormalization, at any rate), if it must be done, should be done at the storage level, not the user level, but the almost one-to-one relationship between those two levels in conventional DBMSs has meant in practice that denormalization is invariably done at the user level too. As noted in Chapter 12, by contrast, in TR there’s no need to denormalize at the user level at all, thanks primarily to the fact that joins are so fast. Hence, we can—at last—achieve the benefits of properly normalized designs, without having to pay any associated performance penalty. (As for denormalizing at the storage level, in TR the question doesn’t even arise, because TR doesn’t physically store relations, as such, at all.)

• Query inefficiencies and overheads: As explained in Chapter 2, the inefficiencies and overheads in question both occur because of the access path selection process. Since TR largely eliminates that process, the problem goes away.

• Update inefficiencies and overheads: The inefficiencies and overheads that occur with queries because of the access path selection process go away here too, for the same reason. Also, I noted earlier that indexes and other auxiliary structures slow down the update process; since TR has no such structures, that problem goes away too.

• Data independence: See the discussion in Section 15.2.

—◆◆◆◆◆—

Next I’d like to pull together a few miscellaneous points (they’re mostly repeats of points I’ve already made elsewhere, but I’d still like to include them explicitly here):

• Symmetric performance: I explained in Chapter 5 that the relational model provided “symmetric exploitation” but that implementations prior to TR didn’t provide any comparable symmetry in performance. But TR—even if it doesn’t provide symmetry in performance 100 percent—certainly comes much closer to doing so than previous approaches ever did. This is because the TR data representations are themselves very symmetric in nature. To my mind, this fact is a virtue in itself—it adds an element of “rightness,” as it were. As George Polya says (admittedly in a rather different context) in his book How to Solve It [62]: “Try to treat symmetrically what is symmetrical, and do not destroy wantonly any natural symmetry.” I’ve always found this advice of Polya’s a most valuable precept to follow in my own work on the relational model and related matters.

• High performance: Of course, TR doesn’t just provide symmetric performance, it provides very good performance, too. Indeed, I opened Chapter 1 by saying that somebody had at last implemented the go faster! command, and we could now build DBMSs that were “blindingly fast.” What’s more, the performance advantage of TR over traditional systems increases dramatically with the complexity of the query; the more complex the query, the greater the gain (see Chapters 5 and 10). However, I would hope by now that you realize that high performance is only one of the many benefits that TR technology can provide. Certainly it’s a critically important benefit, but, to say it again, it’s not the only one.

• Join performance: In connection with the performance issue, I really have to repeat this particular point, because it’s so significant (I’m tempted to say staggering): Joins involve linear instead of multiplicative performance costs (in other words, joins are scalable). As I said before, this fact by itself is sufficient in my opinion to place TR in a class of its own, quite apart from all of its other advantages.

• Update performance: We saw in Chapter 6 that the TR transforms don’t imply good performance for retrieval at the expense of update; update performance is good, too.

• Direct end-user access: If performance is no longer an issue, then (as noted in Chapter 1) there’s no need for the IT department to keep end-users shut out from their own data. In other words, end-users should be able to access the database directly for themselves, without having to go through the potential bottleneck of the IT department.

• Concurrency control: Now this is a topic I haven’t discussed in this book at all, prior to this point; nor do I mean to get into a detailed discussion of it at this late juncture. The fact is, however, the TR internal structures form a good basis on which to implement sophisticated locking techniques, including (though not limited to) techniques that—like the retrieval and update operations discussed in the body of the book—essentially operate at the level of individual fields instead of records. What’s more, locks are typically held for a much shorter time, precisely because queries and updates are so fast.

—◆◆◆◆◆—

Let me conclude this review of TR benefits with one more item from the TR documentation (it’s based on some remarks in an internal document, but I’ve edited those remarks considerably here). I think it pretty much speaks for itself. With traditional DBMSs, the database administrator’s job typically involves a complicated balancing act among four independent sets of requirements:

• Query performance: We want queries to perform well.

• Update performance: We want updates to perform well, too.

• Storage space: We want to keep the physical size of the database within reasonable bounds.

• Optimizability: Given that traditional optimizers are far from perfect, we want to stay within the bounds of what the optimizer can reasonably be expected to handle.

The trouble is, although the requirements are independent, the mechanisms for meeting them in conventional DBMSs typically aren’t. But TR is different—TR replaces the usual series of vexing tradeoffs with dramatic improvements in all of these areas simultaneously.

15.5 Possible Future Developments

In this section, I’d like to speculate briefly about possible future applications of TR technology as I’ve described it in previous chapters. However, I must immediately make it clear that everything that follows is my opinion only; in particular, I’m categorically not “preannouncing” any TR products, nor am I disclosing anything from any of the follow-on patents. Rather, I just want to describe what might be thought of as a “future directions wish list” on my own part. What’s more, I strongly suspect that some of the items in the list will require certain extensions to the TR model as described in previous chapters.

In a way, just about everything I want to say in what follows can be regarded as part of the same overall point: Let’s implement the relational model! In other words, it’s my belief that if we were to build a true relational DBMS, as Hugh Darwen and I have advocated in The Third Manifesto [40]—and I’ve tried to suggest all through this book that TR technology would be ideally suited to that task—then we would at least have the right framework for implementing all of the other items that I indicate below might be desirable. In fact, I want to go further; I want to suggest that trying to implement those desirable items in any other kind of framework is likely to prove more difficult than doing it right.7

Be that as it may, a true relational system would include direct support for all of the relational operators discussed in Chapter 10 and others besides, including at least attribute rename, semijoin, semidifference, compose, and transitive closure (TCLOSE).8 It would also include direct, comprehensive, and systematic—that is, not ad hoc—support for relational comparisons (for example, the ability to test whether two relations are equal, whether one is a subset of another, and so on), integrity constraints, and view updating. All of these matters are discussed in detail in one or both of references [32] and [40].

Now, one aspect of the relational model that’s very widely misunderstood is the following (and this observation is relevant to just about everything else I want to say in this section): The relational model has absolutely nothing to say regarding the nature of the types over which relations are defined. In particular, although people tend to think of those types as being very simple—integers, character strings, and so forth—there’s absolutely nothing in the relational model that requires them to be limited to such simple forms. Thus, we might have an “audio recordings” type, a “geographic map” type, a “video recordings” type, an “engineering drawings” type, a “legal documents” type, a “geometric objects” type, and on and on.
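To make the operator point above a little more concrete, here is a small sketch of relational comparisons and of TCLOSE, with a relation represented crudely as a Python set of tuples. I should stress that the sketch is mine, the data in it is invented, and it is emphatically not a proposal for how a real system, TR-based or otherwise, would implement these operators:

def is_subset(r1, r2):
    # Relational comparison: is every tuple of r1 also a tuple of r2?
    return r1 <= r2

def are_equal(r1, r2):
    # Relational comparison: do r1 and r2 contain exactly the same tuples?
    return r1 == r2

def tclose(r):
    # TCLOSE: transitive closure of a binary relation r (a set of pairs).
    # Keep adding (x, z) whenever (x, y) and (y, z) are both present,
    # until nothing new appears (a naive fixpoint computation).
    result = set(r)
    while True:
        new_pairs = {(x, z)
                     for (x, y1) in result
                     for (y2, z) in result
                     if y1 == y2}
        if new_pairs <= result:
            return frozenset(result)
        result |= new_pairs

# Invented example: direct part-to-component links and their closure.
contains = frozenset({("P1", "P2"), ("P2", "P3"), ("P3", "P4")})
print(sorted(tclose(contains)))
# [('P1', 'P2'), ('P1', 'P3'), ('P1', 'P4'),
#  ('P2', 'P3'), ('P2', 'P4'), ('P3', 'P4')]
print(is_subset(contains, tclose(contains)))   # True

The point of the sketch is simply that such operators are easy to state precisely; making them fast on databases of realistic size is exactly where implementation technology of the TR kind would earn its keep. Now back to the question of types.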

Relation types are an extremely important special case of the foregoing. That is, the system should support types whose values are relations, and therefore should also support relations with attributes of such types; in other words, it should support relations with attributes whose values are relations in turn (“relation-valued attributes”). A simple example is given in Fig. 15.3.

Fig. 15.3: A relation with a relation-valued attribute
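In the same sketch style as before, and with data invented purely for the purpose (it is not meant to reproduce the contents of Fig. 15.3), a relation with a relation-valued attribute, together with the group and ungroup operators mentioned below, might look roughly like this:

# Illustrative only: a shipments relation SP, grouped so that each
# supplier number is paired with a nested relation of (part, quantity).
# Attributes are positional here; a real system names them, of course.

def group(r, key_index):
    # Collect, for each value of the "key" attribute, the remaining
    # attribute values as a nested relation (a frozenset of tuples).
    nested = {}
    for t in r:
        key = t[key_index]
        rest = tuple(v for i, v in enumerate(t) if i != key_index)
        nested.setdefault(key, set()).add(rest)
    return frozenset((k, frozenset(v)) for k, v in nested.items())

def ungroup(r):
    # The inverse mapping: flatten the nested relation back out.
    return frozenset((k,) + t for (k, inner) in r for t in inner)

sp = frozenset({("S1", "P1", 300), ("S1", "P2", 200), ("S2", "P1", 100)})
sp_grouped = group(sp, 0)
# sp_grouped has two attributes: a supplier number and a relation-valued
# attribute whose values are (part, quantity) relations.
assert ungroup(sp_grouped) == sp

The nested values here really are relations in their own right; that, of course, is the whole point.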

Note: You might have encountered claims in the literature to the effect that relation-valued attributes violate the requirements of normalization (indeed, I’m on record as having made such claims myself—in earlier editions of reference [32] in particular). Such claims are incorrect, however. See reference [32] for further explanation.

Support for relation-valued attributes involves among other things support for operators, called group and ungroup in references [32] and [40], for mapping between relations without such attributes and relations with them. Also, it turns out that relation-valued attributes are important, at least conceptually, in connection with temporal database support (see the paragraphs immediately following).

Interval types are another important special case of types in general. In particular, such types provide the basis for proper temporal database support (which is a crucial aspect of data warehouse systems, albeit one that hasn’t yet been implemented in existing data warehouse products so far as I know). For example, Fig. 15.4 gives an example of a temporal relation; that relation is supposed to show that certain suppliers supplied certain parts during certain intervals of time (you can read d04, d06, etc., as “day 4,” “day 6,” etc.; likewise, you can read [d04:d06] as “the interval from day 4 to day 6 inclusive,” etc.). DURING in that relation is an example of an interval-valued attribute. Note: The similarity between those DURING intervals and the row ranges discussed elsewhere in this book isn’t entirely coincidental.

Fig. 15.4: A relation with an interval-valued attribute
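Once again purely as a sketch, here is the flavor of an interval type and of the pack operator that is about to be mentioned. The data is invented, and the representation of an interval as a simple pair of inclusive day numbers is an assumption of mine, adopted only to keep the example short:

# Illustrative only: day intervals as (start, end) pairs, both inclusive,
# and a crude PACK on a single interval-valued attribute.

def merges_with(a, b):
    # Two intervals can be collapsed if they overlap or abut.
    return a[0] <= b[1] + 1 and b[0] <= a[1] + 1

def pack(r):
    # r is a set of tuples of the form (supplier, part, (start, end)).
    # Group on everything except the interval, then collapse each
    # group's intervals into the fewest intervals that cover them.
    groups = {}
    for (s, p, during) in r:
        groups.setdefault((s, p), []).append(during)
    packed = set()
    for (s, p), intervals in groups.items():
        intervals.sort()
        merged = [intervals[0]]
        for iv in intervals[1:]:
            if merges_with(merged[-1], iv):
                merged[-1] = (merged[-1][0], max(merged[-1][1], iv[1]))
            else:
                merged.append(iv)
        packed.update((s, p, iv) for iv in merged)
    return frozenset(packed)

spd = frozenset({("S1", "P1", (4, 6)), ("S1", "P1", (7, 9)),
                 ("S1", "P1", (12, 12)), ("S2", "P2", (2, 5))})
print(sorted(pack(spd)))
# [('S1', 'P1', (4, 9)), ('S1', 'P1', (12, 12)), ('S2', 'P2', (2, 5))]

To repeat, this shows only the flavor of the idea; the operators in question are defined carefully in reference [42].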

Support for interval-valued attributes (and hence for temporal databases) involves among other things support for generalized versions of the usual relational operators. For reasons that need not concern us here, those generalized operators are referred to in reference [42] as “U_” operators; thus, there’s a U_restrict operator, a U_join operator, a U_union operator, and so on. Note: Those “U_” operators are all defined in terms of two new relational operators called pack and unpack, and those latter operators in turn are defined in terms of relation-valued attributes. As already noted, therefore, support for interval-valued attributes relies on support for relation-valued attributes, at least conceptually. Again, see reference [42] for further discussion.

As reference [42] also explains, proper and complete temporal database support additionally requires proper support for type inheritance.9 Thus, I would like to see TR technology used, not just to implement the relational model as such, but also to implement the type system—including the inheritance portions of that system—defined for the relational model in reference [40]; in fact, I would argue that the type system in question must be implemented if temporal databases are to be supported properly and completely.

Of course, proper type support certainly includes support for user-defined types (see the earlier remarks regarding an “audio recordings” type, a “geographic map” type, etc.). In fact, I’ve been assuming such support throughout this book—recall the user-defined types S#, NAME, and so on—but I haven’t made a big deal of it. So let me do so now:

• The first point is that user-defined type support is the sine qua non—at the user or logical level, in fact, it’s the sole distinguishing feature—of the so-called “object/relational” DBMSs (which in my opinion are, or at least ought to be, just relational DBMSs anyway; once again, see reference [40] for further discussion). Thus, if we use TR technology to build a true relational DBMS, we will necessarily have included full user-defined type support (for otherwise the DBMS wouldn’t be a true relational DBMS, by definition), and so we will in fact have built an “object/relational” DBMS. Indeed, the term “object/relational” is little more than a marketing term anyway; it’s needed only because the term “relational” has, sadly, been usurped (some might say destroyed) by SQL.

Perhaps I should mention one particular challenge that arises in connection with the foregoing. The fact is, some user-defined types have values that are very large and require a lot of storage (think of the type “video recordings,” for example). Dealing with such types satisfactorily in a TR environment (or any other environment, come to that) looks like it might be an interesting implementation problem.

• Of course, user-defined type support includes user-defined operator support, too; that is, if we can define our own types, we must be able to define our own operators as well, because types without operators are useless. In particular, we must be able to define our own operators in connection with system- as well as user-defined types. Note: Reference [40] in fact insists on the provision of certain operators (with prescribed semantics) in connection with every type: “=” (equality comparison), “:=” (assignment), certain “selector” and “THE_” operators, and a few others. But, of course, the user is at liberty to define additional ones as well.

• Not all types—in particular, not all user-defined types—are “ordinal” types; that is, some types have no “