When and why are database joins expensive?

I'm doing some research into databases and I'm looking at some limitations of relational DBs.

I gather that joins of large tables are very expensive, but I'm not completely sure why. What does the DBMS need to do to execute a join operation, and where is the bottleneck?
How can denormalization help to overcome this expense? How do other optimization techniques (indexing, for example) help?

Personal experiences are welcome! If you're going to post links to resources, please avoid Wikipedia. I know where to find that already.

In relation to this, I'm wondering about the denormalized approaches used by cloud service databases like BigTable and SimpleDB. See this question.

Evolve answered 6/10, 2008 at 9:52 Comment(9)
Are you also looking into the benefits? ;)Inerrable
I am looking into an objective (if there is such a thing) comparison. Pros, cons, what-have-yous.Evolve
The pre-rendered approaches of cloud computing are predicated on being able to bet every which way, avoiding the "wrong join" problem. Google has some whitepapers on their own systems. Quite interesting - ways to extend applicability of the special cases.Dygert
@PeterWone - care to provide a reference to some of those papers? p.s. to answer the question in your profile, Android is Open Source - well, at least partially, so the geeks jumped on that bandwagon. Seen as technically advanced by the great unwashed, they were followed lemming-like into Google's tight and sweaty embrace! Betamax anyone? Closer to my own heart (and generation), how did MySQL (with no FOREIGN KEYs FFS) become (and remain) the world's most popular "R"DBMS when it had competition from PostgreSQL (no native Windows version) and Firebird (Opensourcing fiasco), or even SQLite?Pickard
Needless to say, I regard PostgreSQL and Firebird as vastly superior to MySQL for multi-user systems and SQLite as stellar in the single user sphere. SQLite handles the sqlite.org site (400,000 hits a day!).Pickard
@Vérace - alas I didn't keep notes re the Google papers. I read them in passing while fact-checking - I'm always prepared to consider opposing viewpoints. I can't even remember buzzwords to help you search. All we have is "Google", "2008" and possibly (only a guess) "big data". I still have Borland manuals for Firebird. I'm not sure I hold it in quite the regard you seem to, but I'd take it over MySQL any day especially since Oracle got its tentacles wrapped around MySQL. I think of SQLite as more of a relational filesystem than a server, because that's how I use it.Dygert
Nitpick :-) Borland never issued manuals for Firebird, but for Interbase (upon which Firebird is based). AIUI, Inprise (a renamed Borland) open sourced Interbase, but rapidly back-pedalled and closed the source again, but not before some devs in the Interbase ecosphere took up the MPL'd code and ran with it to form the Firebird project. Confused? Maybe, but not as much as "management" at Inprise/Borland... :-) Hence my use of the term "fiasco" - it's a real pity, because, get this, they actually had DRI in 2000, and CHECK constraints (which MySQL still doesn't have).Pickard
If they'd GPL'd it à la MySQL, Interbase might have been bought by Sun for $1Bn (and then Oracle...)! Ah, well, sic biscuitus deteriatum! Thanks for the reply - I don't know how I'd do if asked about a decade old Google search!Pickard
This question is too broad because of the significant differences in the implementation of JOINing among DB vendors.Reward

Denormalising to improve performance? It sounds convincing, but it doesn't hold water.

Chris Date, who in company with Dr Ted Codd was the original proponent of the relational data model, ran out of patience with misinformed arguments against normalisation and systematically demolished them using scientific method: he got large databases and tested these assertions.

I think he wrote it up in Relational Database Writings 1988-1991 but this book was later rolled into edition six of Introduction to Database Systems, which is the definitive text on database theory and design, in its eighth edition as I write and likely to remain in print for decades to come. Chris Date was an expert in this field when most of us were still running around barefoot.

He found that:

  • Some of them hold for special cases
  • All of them fail to pay off for general use
  • All of them are significantly worse for other special cases

It all comes back to mitigating the size of the working set. Joins involving properly selected keys with correctly set up indexes are cheap, not expensive, because they allow significant pruning of the result before the rows are materialised.

Materialising the result involves bulk disk reads which are the most expensive aspect of the exercise by an order of magnitude. Performing a join, by contrast, logically requires retrieval of only the keys. In practice, not even the key values are fetched: the key hash values are used for join comparisons, mitigating the cost of multi-column joins and radically reducing the cost of joins involving string comparisons. Not only will vastly more fit in cache, there's a lot less disk reading to do.

Moreover, a good optimiser will choose the most restrictive condition and apply it before it performs a join, very effectively leveraging the high selectivity of joins on indexes with high cardinality.
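
To make that concrete, here is a minimal sketch against a hypothetical orders/customers schema (the table, column and index names are invented for illustration):

    -- Assumed indexes: orders (order_date, status) and the primary key
    -- (hence index) on customers (customer_id).
    SELECT c.customer_id, c.name, o.order_id, o.total
    FROM   orders    o
    JOIN   customers c ON c.customer_id = o.customer_id
    WHERE  o.order_date = DATE '2008-10-06'    -- highly selective predicate
    AND    o.status     = 'SHIPPED';

The optimiser applies the restrictive filter on orders first, then probes customers by its key. The join itself works on key values; the wide customer rows are materialised only for the handful of orders that survive the filter.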

Admittedly this type of optimisation can also be applied to denormalised databases, but the sort of people who want to denormalise a schema typically don't think about cardinality when (if) they set up indexes.

It is important to understand that table scans (examination of every row in a table in the course of producing a join) are rare in practice. A query optimiser will choose a table scan only when one or more of the following holds.

  • There are fewer than 200 rows in the relation (in this case a scan will be cheaper)
  • There are no suitable indexes on the join columns (if it's meaningful to join on these columns then why aren't they indexed? fix it - see the sketch after this list)
  • A type coercion is required before the columns can be compared (WTF?! fix it or go home - see the end notes for an ADO.NET issue that causes exactly this)
  • One of the arguments of the comparison is an expression (no index)
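
A sketch of fixing the two avoidable cases above, using hypothetical table and column names (the exact DDL syntax varies a little between engines; this is close to the SQL Server form):

    -- No suitable index on the join column: create one.
    CREATE INDEX ix_order_lines_order_id ON order_lines (order_id);

    -- Type coercion required before comparison: make the types match.
    -- Joining an INT key in orders to a VARCHAR order_id in order_lines
    -- forces a conversion and defeats the index.
    ALTER TABLE order_lines ALTER COLUMN order_id INT NOT NULL;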

Performing an operation is more expensive than not performing it. However, performing the wrong operation, being forced into pointless disk I/O and then discarding the dross prior to performing the join you really need, is much more expensive. Even when the "wrong" operation is precomputed and indexes have been sensibly applied, there remains a significant penalty. Denormalising to precompute a join - notwithstanding the update anomalies entailed - is a commitment to a particular join. If you need a different join, that commitment is going to cost you big.

If anyone wants to remind me that it's a changing world, I think you'll find that bigger datasets on gruntier hardware just exaggerate the spread of Date's findings.

For all of you who work on billing systems or junk mail generators (shame on you) and are indignantly setting hand to keyboard to tell me that you know for a fact that denormalisation is faster, sorry but you're living in one of the special cases - specifically, the case where you process all of the data, in-order. It's not a general case, and you are justified in your strategy.

You are not justified in falsely generalising it. See the end of the notes section for more information on appropriate use of denormalisation in data warehousing scenarios.

I'd also like to respond to

Joins are just cartesian products with some lipgloss

What a load of bollocks. Restrictions are applied as early as possible, most restrictive first. You've read the theory, but you haven't understood it. Joins are treated as "cartesian products to which predicates apply" only by the query optimiser. This is a symbolic representation (a normalisation, in fact) to facilitate symbolic decomposition so the optimiser can produce all the equivalent transformations and rank them by cost and selectivity so that it can select the best query plan.

The only way you will ever get the optimiser to produce a cartesian product is to fail to supply a predicate: SELECT * FROM A,B
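
A quick illustration, with hypothetical key columns: the first statement is the unconstrained case; the other two are equivalent constrained joins, and the optimiser treats the predicate as part of the join rather than building the full product and filtering it afterwards.

    SELECT * FROM A, B;                          -- cartesian product
    SELECT * FROM A, B WHERE A.id = B.a_id;      -- constrained join
    SELECT * FROM A JOIN B ON A.id = B.a_id;     -- same thing, spelt out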


Notes


David Aldridge provides some important additional information.

There is indeed a variety of other strategies besides indexes and table scans, and a modern optimiser will cost them all before producing an execution plan.

A practical piece of advice: if it can be used as a foreign key then index it, so that an index strategy is available to the optimiser.

I used to be smarter than the MSSQL optimiser. That changed two versions ago. Now it generally teaches me. It is, in a very real sense, an expert system, codifying all the wisdom of many very clever people in a domain sufficiently closed that a rule-based system is effective.


"Bollocks" may have been tactless. I am asked to be less haughty and reminded that math doesn't lie. This is true, but not all of the implications of mathematical models should necessarily be taken literally. Square roots of negative numbers are very handy if you carefully avoid examining their absurdity (pun there) and make damn sure you cancel them all out before you try to interpret your equation.

The reason that I responded so savagely was that the statement as worded says that

Joins are cartesian products...

This may not be what was meant but it is what was written, and it's categorically untrue. A cartesian product is a relation. A join is a function. More specifically, a join is a relation-valued function. With an empty predicate it will produce a cartesian product, and checking that it does so is one correctness check for a database query engine, but nobody writes unconstrained joins in practice because they have no practical value outside a classroom.

I called this out because I don't want readers falling into the ancient trap of confusing the model with the thing modelled. A model is an approximation, deliberately simplified for convenient manipulation.


The cut-off for selection of a table-scan join strategy may vary between database engines. It is affected by a number of implementation decisions such as tree-node fill-factor, key-value size and subtleties of algorithm, but broadly speaking high-performance indexing has an execution time of k log n + c. The c term is a fixed overhead made up mostly of setup time, and the shape of the curve means you don't get a payoff (compared to a linear search) until n is in the hundreds.


Sometimes denormalisation is a good idea

Denormalisation is a commitment to a particular join strategy. As mentioned earlier, this interferes with other join strategies. But if you have buckets of disk space, predictable patterns of access, and a tendency to process much or all of it, then precomputing a join can be very worthwhile.

You can also figure out the access paths your operation typically uses and precompute all the joins for those access paths. This is the premise behind data warehouses, or at least it is when they're built by people who know why they're doing what they're doing, and not just for the sake of buzzword compliance.

A properly designed data warehouse is produced periodically by a bulk transformation out of a normalised transaction processing system. This separation of the operations and reporting databases has the very desirable effect of eliminating the clash between OLTP and OLAP (online transaction processing ie data entry, and online analytical processing ie reporting).

An important point here is that apart from the periodic updates, the data warehouse is read only. This renders moot the question of update anomalies.

Don't make the mistake of denormalising your OLTP database (the database on which data entry happens). It might be faster for billing runs but if you do that you will get update anomalies. Ever tried to get Reader's Digest to stop sending you stuff?

Disk space is cheap these days, so knock yourself out. But denormalising is only part of the story for data warehouses. Much bigger performance gains are derived from precomputed rolled-up values: monthly totals, that sort of thing. It's always about reducing the working set.
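
A minimal sketch of such a roll-up, assuming a hypothetical normalised orders table; it is rebuilt periodically from the OLTP database and never updated directly (the CREATE TABLE AS form and the date functions vary by engine):

    CREATE TABLE monthly_sales AS
    SELECT customer_id,
           EXTRACT(YEAR  FROM order_date) AS yr,
           EXTRACT(MONTH FROM order_date) AS mth,
           SUM(total)                     AS month_total
    FROM   orders
    GROUP  BY customer_id,
              EXTRACT(YEAR  FROM order_date),
              EXTRACT(MONTH FROM order_date);

Reports that only need monthly figures then read this small table instead of re-aggregating the raw rows.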


ADO.NET problem with type mismatches

Suppose you have a SQL Server table containing an indexed column of type varchar, and you use AddWithValue to pass a parameter constraining a query on this column. C# strings are Unicode, so the inferred parameter type will be NVARCHAR, which doesn't match VARCHAR.

VARCHAR to NVARCHAR is a widening conversion so it happens implicitly - but say goodbye to indexing, and good luck working out why.
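
Here is the effect seen from the T-SQL side, with a hypothetical table (the same thing happens when ADO.NET infers an NVARCHAR parameter from a C# string):

    CREATE TABLE customers (customer_code VARCHAR(20) NOT NULL, name VARCHAR(100) NOT NULL);
    CREATE INDEX ix_customers_code ON customers (customer_code);

    -- NVARCHAR parameter: the VARCHAR column is implicitly widened,
    -- which typically rules out an index seek.
    DECLARE @code NVARCHAR(20) = N'ABC123';
    SELECT name FROM customers WHERE customer_code = @code;

    -- Parameter declared with the matching type: index seek restored.
    DECLARE @code2 VARCHAR(20) = 'ABC123';
    SELECT name FROM customers WHERE customer_code = @code2;

In ADO.NET the cure is to specify the parameter type explicitly (for example via the SqlDbType overloads) rather than letting AddWithValue infer it.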


"Count the disk hits" (Rick James)

If everything is cached in RAM, JOINs are rather cheap. That is, normalization does not have much performance penalty.

If a "normalized" schema causes JOINs to hit the disk a lot, but the equivalent "denormalized" schema would not have to hit the disk, then denormalization wins a performance competition.

Comment from original author: Modern database engines are very good at organising access sequencing to minimise cache misses during join operations. The above, while true, might be misconstrued as implying that joins are necessarily problematically expensive on large data. That would lead to poor decision-making on the part of inexperienced developers.

Dygert answered 6/10, 2008 at 9:52 Comment(29)
Do you have any references for me to look into (aside from Date's book)?Evolve
Some of these statements are specific to a particular DBMS, aren't they? e.g. "There are fewer than 200 rows in the relation"Inerrable
I don't think C J Date had anything to do with the creation of SQL, and I'm fairly certain Codd didn't. Do you mean the relational model? If you do mean SQL, I'd like to see references.Theft
"The basis for SQL can be traced down to Dr. E F Codd. In June 1970 he published an article called 'A relational model of data for large shared data banks',which appeared in the ACM magazine, Vol. 13, No. 6, pp. 377-387"Questionable
@Mitch: I'm aware of that paper. It makes no mention of SQL. Where is that quote from?Theft
All this is correct. I just want to point out I wasn't trying to do an apology of denormalization. Only to explain why it can be faster. Assuming same indexes, selects on denormalized tables will be faster no matter what. Updates are a different matter of course.Tritium
This does not mean you always want to denormalize of course, or else we'd all be working on single table dbs :)Tritium
'Bollocks!!' now come on, stop being so snobbish. I was commenting on the mathematical nature of the join operation and detailing why it is so. I also mentioned additional optimisations; I never said that it is the way to go. Math doesn't lie, it would only be human ignorance that would.Paraphrastic
By the way, our chum Vinko would appreciate it if you'd delete the personal references now that the post you're referring to was deleted to avoid further confusion, especially given the mischaracterization of it (or your lack of reading until the end of it, don't know what's the case)Tritium
No need to apologize too much. Sometimes things are plain wrong and need to be called out. Wrapping it in pleasantries is a waste of time.Farnesol
Apparently, Mr Codd was not the creator of SQL, but actually of the relational model. He has on more than one occasion voiced his dissatisfaction with the way SQL "perverts" the relational model. I edited this in the above text.Evolve
Very interesting - I knew Codd peddled the relational model at IBM and I knew SQL came from IBM as a result of this, and from this I incorrectly assumed Codd-->SQL. Live and learn. But SQL!=relational-model is a given. Chris Date doesn't like SQL either.Dygert
I see I made a mistake in my previous comment. I meant mr Date instead of Codd. It's correct in the post, though.Evolve
In the general case the '200 rows' figure boils down to 'how much table data fits into a single disk operation'. The DB will read data in lots of one or more disk blocks (on Oracle, for example, this is configurable). This is one disk op. Reading from an index and then the table is two or more ops.Somber
Table scans can also be chosen when you're trying to seek more than x% of rows. I've heard that for SQL Server, x is anywhere from 10-20%.Duroc
Does the use of surrogate keys (or not) influence all this significantly?Agnomen
The implications of the presence or absence of surrogate keys depends on many things, the most significant of which are already considered in the main text above. Whether they will be used is up to the optimiser. If you fail to provide an index you guarantee that it won't be used. Indexes do impose hefty overheads on updates. In a data warehousing scenario the OLAP database would be a lot more heavily indexed than the normalised OLTP database.Dygert
Your answer is very long; it would be nice to have a short conclusion. Have I understood you correctly: joins are very cheap, but denormalized tables may be faster on reads?Thwack
Database Design for Dummies: Learn about OLAP and OLTP. ALWAYS normalise OLTP. Denormalise OLAP if you think it will help. Experiment, it can be hard to predict. DO NOT UPDATE OLAP directly, always update OLTP and regenerate from there.Dygert
The great E F Codd is solely responsible for the Relational Model. C J Date, and more recently H Darwen, are both idiots, who do not understand the RM, and provide masses of information on "how to improve" the RM, all of which can be dismissed, because one cannot fix what one does not understand. They serve only to damage the relevance of the RM, by suggesting that there is something "missing".Bartholemy
There were some incomplete bits in early SQL, but the RM is complete, for many years now the commercial SQL platforms provide the completions that were unavailable while Codd was alive. Codd worked at IBM when they wrote System R, which is the predecessor to their SQL, which later became a standard. Codd influenced both System R and SQL heavily, not only because the purpose of them was to implement the RM. Learn the RM, and use it faithfully.Bartholemy
This is the first time I have ever seen anyone describe Hugh Darwen and Chris Date as idiots who don't understand the relational model. Since practically everyone learned about the RM from Messrs Date and Darwen, there is a good chance that you are the only person in the whole world who does understand it. Perhaps you can lead us all into the light.Dygert
Interesting post, Peter. I had to research this for a discussion question for class. I've heard of people using NoSQL for all the wrong reasons (incorrectly set up databases, trying the Phillips-screwdriver-on-a-flathead-screw approach with their SQL database, bad queries, and so on). I had to write a research paper with a group on NoSQL and it has its own limitations (nothing is quite standard between different implementations is the biggest, and there was a performance problem I found but need to look back up).Immunize
I prefer relational database systems because, from research, most NoSQL systems are complicated (queries on one are different from another).Immunize
And there, jeffery, you touch on the very crux of relational systems: they permit ad hoc queries, yet the methodology is anything but ad hoc. Until you impose orthogonal structure and closure, you can't build a general purpose query optimiser.Dygert
Also, don't forget that many NoSQL databases are essentially the same databases that we discarded 40 years ago. Young people always think they've discovered something new. Fabian Pascal: dbdebunk.com/2014/02/thinking-logically-sql-nosql-and.htmlMercia
I'd add "You're using MySQL" to your list of special cases, as its query optimiser is piss poor to the point where denormalising is sometimes the only viable choice (though why use MySQL, other than ignorance?).France
It is very important to clarify that the relational model was originally developed by a Turing Award recipient, Dr. Edgar Frank Codd, and he published it in 1970 by means of one of the most influential computer science papers ever written, A Relational Model of Data for Large Shared Data Banks.Morbidity
Aggressive. It was a good account, but the aggressions and micro-aggressions don't add to the content, or the value of the content.Wavellite

What most commenters fail to note is the wide range of join methodologies available in a complex RDBMS, and the denormalisers invariably gloss over the higher cost of maintaining denormalised data. Not every join is based on indexes, and databases have a lot of optimised algorithms and methodologies for joining that are intended to reduce join costs.

In any case, the cost of a join depends on its type and a few other factors. It needn't be expensive at all - some examples.

  • A hash join, in which bulk data is equijoined, is very cheap indeed, and the cost only becomes significant if the hash table cannot be cached in memory. No index required. Equi-partitioning between the joined data sets can be a great help.
  • The cost of a sort-merge join is driven by the cost of the sort rather than the merge -- an index-based access method can virtually eliminate the cost of the sort.
  • The cost of a nested loop join on an index is driven by the height of the b-tree index and the access of the table block itself. It's fast, but not suitable for bulk joins.
  • A nested loop join based on a cluster is much cheaper, with fewer logical IOs required per join row -- if the joined tables are both in the same cluster then the join becomes very cheap through the colocation of joined rows.

Databases are designed to join, and they're very flexible in how they do it and generally very performant unless they get the join mechanism wrong.
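
The easiest way to see which of these mechanisms the optimiser actually chose is to look at the execution plan. The syntax varies by engine (EXPLAIN PLAN plus DBMS_XPLAN in Oracle, SHOWPLAN or graphical plans in SQL Server); this sketch uses the PostgreSQL form, with hypothetical table names:

    EXPLAIN ANALYZE
    SELECT o.order_id, c.name
    FROM   orders o
    JOIN   customers c ON c.customer_id = o.customer_id
    WHERE  o.order_date >= DATE '2008-01-01';
    -- The plan output names the join method (Hash Join, Merge Join,
    -- Nested Loop) along with estimated and actual row counts.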

Inerrable answered 6/10, 2008 at 12:48 Comment(2)
I think it comes down to "if in doubt, ask your DBA". Modern databases are complex beasts and do require study to understand. I've only been using Oracle since 1996 and it's a full time job keeping up with the new features. SQL Server has also come a long way since 2005. It's not a black box!Radian
Hmmm, well in my humble experience there are too many DBA's out there who have never heard of a hash join, or think that they're a Universally Bad Thing.Inerrable

I think the whole question is based on a false premise. Joins on large tables are not necessarily expensive. In fact, doing joins efficiently is one of the main reasons relational databases exist at all. Joins on large sets often are expensive, but very rarely do you want to join the entire contents of large table A with the entire contents of large table B. Instead, you write the query such that only the important rows of each table are used and the actual set kept by the join remains smaller.

Additionally, you have the efficiencies mentioned by Peter Wone, such that only the important parts of each record need be in memory until the final result set is materialized. Also, in large queries with many joins you typically want to start with the smaller table sets and work your way up to the large ones, so that the set kept in memory remains as small as possible as long as possible.

When done properly, joins are generally the best way to compare, combine, or filter on large amounts of data.
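
For illustration, with made-up table names: the first query joins the full contents of two large tables, which is rarely what anyone needs; the second keeps the joined set small with ordinary predicates.

    -- Whole of both tables: large and expensive.
    SELECT *
    FROM   orders o
    JOIN   order_lines l ON l.order_id = o.order_id;

    -- The usual case: the predicates prune orders, and the join key keeps
    -- the order_lines read small.
    SELECT o.order_id, SUM(l.quantity * l.unit_price) AS order_total
    FROM   orders o
    JOIN   order_lines l ON l.order_id = o.order_id
    WHERE  o.customer_id = 42
    AND    o.order_date >= DATE '2008-09-01'
    GROUP  BY o.order_id;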

Rysler answered 6/10, 2008 at 13:53 Comment(1)
@joel. The converse is also true. Large dataset joins can be expensive and are sometimes required, but you don't want to do it too often unless a) you can handle the IO and RAM needed and b) you're not doing it too often. Consider materialised views, reporting systems, realtime vs CoB reports.Radian

The bottleneck is pretty much always disk I/O, and even more specifically - random disk I/O (by comparison, sequential reads are fairly fast and can be cached with read ahead strategies).

Joins can increase random seeks - if you're jumping around reading small parts of a large table. But, query optimizers look for that and will turn it into a sequential table scan (discarding the unneeded rows) if it thinks that'd be better.

A single denormalized table has a similar problem - the rows are large, and so fewer fit on a single data page. If you need rows that are located far from one another (and the large row size pushes them further apart) then you'll have more random I/O. Again, a table scan may be forced to avoid this. But, this time, your table scan has to read more data because of the large row size. Add to that the fact that you're copying data from a single location to multiple locations, and the RDBMS has that much more to read (and cache).

With 2 tables, you also get 2 clustered indexes - and can generally index more (because of less insert/update overhead) which can get you drastically increased performance (mainly, again, because indexes are (relatively) small, quick to read off disk (or cheap to cache), and lessen the amount of table rows you need to read from disk).

About the only overhead with a join comes from figuring out the matching rows. Sql Server uses 3 different types of joins, mainly based on dataset sizes, to find matching rows. If the optimizer picks the wrong join type (due to inaccurate statistics, inadequate indexes, or just an optimizer bug or edge case) it can drastically affect query times.

  • A loop join is fairly cheap when at least one of the datasets is small.
  • A merge join requires a sort of both datasets first. If you join on an indexed column, though, then the index is already sorted and no further work needs to be done. Otherwise, there is some CPU and memory overhead in sorting.
  • The hash join requires both memory (to store the hashtable) and CPU (to build the hash). Again, this is fairly quick in relation to the disk I/O. However, if there's not enough RAM to store the hashtable, Sql Server will use tempdb to store parts of the hashtable and the found rows, and then process only parts of the hashtable at a time. As with all things disk, this is fairly slow.

In the optimal case, these cause no disk I/O - and so are negligible from a performance perspective.

All in all, at worst - it should actually be faster to read the same amount of logical data from x joined tables, as it is from a single denormalized table because of the smaller disk reads. To read the same amount of physical data, there could be some slight overhead.

Since query time is usually dominated by I/O costs, and the size of your data does not change (minus some very minuscule row overhead) with denormalization, there's not a tremendous amount of benefit to be had by just merging tables together. The type of denormalization that tends to increase performance, IME, is caching calculated values instead of reading the 10,000 rows required to calculate them.

Duroc answered 3/11, 2008 at 23:33 Comment(2)
Reducing random seeks: good point, although a good RAID controller with a big cache will do elevator read/write.Dygert
The best answer in the thread! Covered the most significant aspects and their effect on disk, CPU and RAM. Though, the conclusion about denormalisation is valid only for reading large data. Modern apps usually handle paginated requests with modest output. In this case denormalisation wins.Ng

The order in which you join the tables is extremely important. If you have two sets of data, try to build the query so that the smaller one is used first, to reduce the amount of data the query has to work on.

For some databases it does not matter; for example, MS SQL works out the proper join order most of the time. For some (like IBM Informix) the order makes all the difference.

Kwangchow answered 6/10, 2008 at 9:58 Comment(7)
In general a decent query optimizer is going to be unaffected by the order that the joins or tables are listed, and will make its own determination of the most efficient way to perform the join.Inerrable
MySQL, Oracle, SQL Server, Sybase, PostgreSQL, etc. care not about the order of joins. I've worked with DB2 and it also, to my knowledge, doesn't care what order you put them in. This is not helpful advice in the general case.Farnesol
MySQL clustering using the NDB engine (admittedly an edge case, and only advanced developers are going to go near NDB) doesn't guess the join order correctly, so you have to add "USE INDEX" statements to most joined queries or they'll be horrendously inefficient. MySQL docs cover it.Hanuman
@iiya, Understanding what the optimiser will choose is more important than generalised statements or "myths" about table ordering. Do not rely on a particular quirk in your SQL as the behaviour often changes when the RDBMS is upgraded. Oracle has changed behaviours several times since v7.Radian
MSSQL uses a pruning algorithm in its optimiser to prevent paralysis by analysis (which apparently happens to machines too) and this can occasionally make join sequence significant.Dygert
Yes, there are edge cases where it can "somewhat" matter (joining many many tables) but in general, it's nothing that most SQL developers should devote any brainspace to whatsoever. I'd rather they worried about indexing and query optimization first.Farnesol
@Matt I've seen Oracle 9i perform very different optimizations and query plans just by adjusting the join order. Maybe this has changed from version 10g onwards?Berdichev

Deciding whether to denormalize or normalize is a fairly straightforward process when you consider the complexity class of the join. For instance, I tend to design my databases with normalization when the queries are O(k log n), where k is relative to the desired output magnitude.

An easy way to denormalize and optimize performance is to think about how changes to your normalized structure affect your denormalized structure. It can be problematic, however, as it may require transactional logic to work on a denormalized structure.

The debate for normalization and denormalization isn't going to end since the problems are vast. There are many problems where the natural solution requires both approaches.

As a general rule, I've always stored a normalized structure and denormalized caches that can be reconstructed. Eventually, these caches save my ass when future normalization problems need solving.

Siouxie answered 20/9, 2009 at 12:25 Comment(0)

Elaborating on what others have said:

Joins are just cartesian products with some lipgloss. {1,2,3,4}X{1,2,3} would give us 12 combinations (n × m). This computed set acts as a reference on which conditions are applied. The DBMS applies the conditions (like where both left and right are 2 or 3) to give us the matching row(s). Actually it is more optimised, but the problem is the same. Changes to the size of the sets increase the result size multiplicatively, and the amount of memory and CPU cycles consumed is affected in the same terms.

When we denormalise, we avoid this computation altogether; think of having a colored sticky note attached to every page of your book. You can infer the information without using a reference. The penalty we pay is that we are compromising the essence of the DBMS (optimal organisation of data).

Paraphrastic answered 6/10, 2008 at 11:9 Comment(3)
-1: This post is a great example of why you let the DBMS perform the joins -- because the DBMS designers think about these issues all the time and come up with more effective ways to do it than the compsci 101 method.Inerrable
@David: Agreed. DBMS optimizer programmers are some smart cookiesFarnesol
This answer is incorrect. If your query is executed against a normalized, indexed database and has any kind of filter or join condition, the optimizer will find a way to avoid the Cartesian product and minimize memory usage and CPU cycles. If you actually intend to select a Cartesian product, you'll use the same memory in a normalized or de-normalized db.Plumose
