sql server - Which order of joins is faster? -


i'm looking @ ms sql server database developed company expert @ database design (or i'm told) , noticed curious pattern of joins/indexes. it's upside down have done, wonder if has performance benefits (the db big).

the table structure (simplified pseudocode) is:

table jobs (about 1k rows):

  • job_id [int, primary key]
  • server_id [int, foreign key]
  • job_name [string]

table job_history (about 17m rows):

  • history_id [int, primary key]
  • job_id [int, foreign key]
  • server_id [int, foreign key]
  • job_start [datetime]
  • job_duration [int]

note denormalization server_id in both tables.

what did is:

select     t1.job_name, t2.job_start, t2.job_duration     jobs t1 inner join      job_history t2 on (t1.job_id = t2.job_id , t1.server_id = t2.server_id)     t1.server_id = @param_server_id     , t2.job_start >= @param_from     , t2.job_start <= @param_to 

and have indexes:

  • jobs => (server_id)
  • job_history => (job_id, server_id, job_start)

in other words, when select rows, first filter jobs jobs table , relevant job_history entries. db forced do, because of indexes.

what have done bottom-up version:

select     t1.job_name, t2.job_start, t2.job_duration     job_history t2 inner join      jobs t1 on (t1.job_id = t2.job_id)     t2.server_id = @param_server_id     , t2.job_start >= @param_from     , t2.job_start <= @param_to 

and single index:

  • job_history => (server_id, job_start)

so, basically, directly select relevant rows large job_history , attached data jobs table.

is there reason prefer 1 on other?

well, bit bored thought i'd re-create you. first setup (i'm using numbers table generate 1k , 17m rows, of course, random data , doesn't represent system :) i'm assuming theres clustered index on each table, though imply wouldn't have one.

    use tempdb;     go     drop table if exists #jobs;     drop table if exists #job_history;      create table #jobs         (          job_id int identity primary key         ,server_id int         ,job_name varchar(50)         );      create table #job_history         (          history_id int identity primary key         ,job_id int         ,server_id int         ,job_start datetime default sysdatetime()         ,job_duration int default abs(checksum(newid())) % 5000         );     go      insert  #jobs             select  server_id = n.n                    ,job_name = convert(varchar(50), newid())                dba.dim.numbers n               n < 1000;      insert  #job_history             ( job_id             ,server_id             )             select  job_id = j1.job_id                    ,server_id = j1.server_id                #jobs j1                     cross join dba.dim.numbers n               n < 17000; 

now, case 1 (their way)

    drop index if exists idx_job_hist on #job_history;     create nonclustered index idx_job_hist on #job_history (job_id, server_id, job_start);      dbcc freeproccache     dbcc dropcleanbuffers      declare @param_server_id int = 1234     declare @param_from int = 500     declare @param_to int = 1000      select         t1.job_name, t2.job_start, t2.job_duration             #jobs t1     inner join          #job_history t2 on (t1.job_id = t2.job_id , t1.server_id = t2.server_id)             t1.server_id = @param_server_id         , t2.job_start >= @param_from         , t2.job_start <= @param_to; 

and case 2 (your way)

    drop index if exists idx_job_hist on #job_history;     create nonclustered index idx_job_hist on #job_history (server_id, job_start);      select         t1.job_name, t2.job_start, t2.job_duration             #job_history t2     inner join          #jobs t1 on (t1.job_id = t2.job_id)             t2.server_id = @param_server_id         , t2.job_start >= @param_from         , t2.job_start <= @param_to; 

and (totally non-conclusive, because system isn't system...) results:

their plan: enter image description here

your plan: enter image description here

the costs plan higher overall.

but rather artificial exercise prove point - run plans, answer - depends.

(thanks excuse play this, fun :)


Comments

Popular posts from this blog

sql - VB.NET Operand type clash: date is incompatible with int error -

SVG stroke-linecap doesn't work for circles in Firefox? -

python - TypeError: Scalar value for argument 'color' is not numeric in openCV -