Joins and Implementation of relationships in sql

We have three types of relationships in SQL.

  • one-to-one
  • one-to-many
  • many-to-many
  • To implement these relationships

    One-to-one: Use foreign key to the referenced table
    Example:
    Student: sid, fname, lname, addressId
    Address: addressId, adress, city, sid

    One-to-many: Use foreign key on the “many” side of the relationship linking back to the “one” side.
    Example:
    Teacher: tid, fname, lname
    Classes: cid, cname, tid    
    Note: Here table ‘Teacher’ is one side and table ‘Classes’ is on many side

    Many-to-many: Use junction table.
    The junction table stores the primary keys of each table that is involved in the relationship.
    Example:
    Student: sid, fname, lname
    classes: cid, cname, tid
    student_classes: cid, sid

    Joins

    Here is a very good chart of all joins (and its variations) in SQL
    joins
    Reference: http://stackoverflow.com/questions/38549/difference-between-inner-and-outer-joins

    What is the difference between sub query and co-related subquery

    Sub Query:
    Here inner query gets executed first and based on the result of it the outer query is executed.

    This follows BOTTOM_UP approach, first BOTTOM query (inner query) executes and then UPper query (outer query) executes.

    Problem Statement: Get all the employees whose department name is ‘A’
    Tables are:
    Employee_Table: emp_id_col, emp_name_col, emp_sal_col, emp_dept_id_col
    Dept_table: dept_id_col, dept_name_col, dept_loc_col
    Here dept_id the foreign key

    select * from Employee_Table where 
          empt_dept_id_col = 
          (select dept_id_col from Dept_table where dept_name_col='A')
    

    Correlated Sub Query:
    I would say this as a kind of SELF JOIN. First outer query executes and then inner query and inner query gets evaluated for each row processed by the outer query.

    This follows TOP-BOTTOM approach
    Problem Statement: Get the list of employees whose salary is greater than the average salary of every department.
    Tables are:
    Employee_Table: emp_id_col, emp_name_col, emp_sal_col, emp_dept_id_col

    select * from Employee_Table e where emp_sal_col > 
       (select avg(emp_sal_col) from Employee_Table m 
            where e.emp_dept_id_col = m.emp_dept_id_col)
    

    Here in outer query the table alias ‘e’ gets evaluated before the inner query executes.

     

    Find nth maximum salary of an employee.

    This is another example of co-related sub query and most frequently asked interview question.

    emp_id_col emp_name_col emp_sal_col
    1 A 100
    2 B 200
    3 C 300
    4 D 400
    select emp_sal_col from Employee_Table e1 
      where 
        N-1 
        = 
        (select count(distinct(emp_sal_col)) from Employee_Table e2 
         where e1.emp_sal_col > e2.emp_sal_col)
    // Where N is nth highest salary that we want to find.
    

    Explanation:
    As correlated subquery operates in top_bottom approach, first outer query executes and then inner query. Here, for every row of outer query, inner query is executed.
    Let’s say we want to find 3rd highest salary(i.e, 200)
    : First row with employee id ‘1’ and sal ‘100’ from outer query is passed to the inner query. In inner query we check count(distinct(sal)) which is ‘4’ where outerEmployeeTableView.sal > innerEmployeeTableView.sal (it means salaries greater than ‘100’) which is ‘3’. Since N-1, that is (3-1 = 2), does not match we go for next row.
    : Second row with employee id ‘2’ and sal ‘200’ from outer query is passed to the inner query. In inner query we check count(distinct(sal)) which is ‘4’ where outerEmployeeTableView.sal > innerEmployeeTableView.sal (it means salaries greater than ‘200’) which is ‘2’. Since N-1, that is (3-1 = 2), matches, outer query returns the salary ‘200’ as output which is 3rd highest salary in the table.

    When should we use GROUP BY clause

    When there is a need to aggregate/group values in a column based on some criteria in same/another column then we go for GROUP BY clause.

    Usage of Aggregate function on column: col_sales:

    select sum(col_sales) from Table_Sales;

    Output: here the output would just one value

     

    Usage of Aggregate function on column ‘col_sales’ based on date in column ‘col_sales_date’

    Problem Statement: Calculate the total sales on day by day basis (OR daily sales) from Table_Sales table
    Query:
    select col_sales_date, sum(col_sales) from Table_Sales GROUP BY col_sales_date;

    Output: here the output would be number of rows proportional to the number of different dates in the column ‘col_sales_date’

    frequent sql interview questions

    difference between ‘where’ and ‘having’ clause

    ‘where’ clause is used to apply a condition on an column directly.

    Usage
    select * from table_emp where col_emp_sal<10000;

    ‘having’ clause is used to apply a condition on a aggregate column.

    Usage
    select sum(col_emp_sal) as aggr_col_emp_sal from table_emp having aggr_col_emp_sal<10000;

    Note: In the above query we cannot use where clause on aggregate alias ‘aggr_col_emp_sal’

    difference between ‘distinct’ and ‘group by’ clauses

    ‘distinct’ keyword is used to remove duplicates
    D1.select distinct(col_dept_id) from Table_Employee;  

    ‘group by’ enables us to use aggregate functions, see in more detail here
    G1: select col_sales_date, sum(col_sales) from Table_Sales GROUP BY col_sales_date;
    G2: select sum(col_sales) from Table_Sales;

    If D1&G2 is considered, both returns the same result but the way they work internally, their execution plans varies across the database servers.

    In MS SQL: The execution plan for both the queries is same and hence there wont be any difference
    Doing an “EXPLAIN SELECT DISTINCT” shows “Using where; Using temporary ” MySQL will create a temporary table.

    vs a “EXPLAIN SELECT a,b, c from T1, T2 where T2.A=T1.A GROUP BY a” just shows “Using where”

    difference between UNION and UNION ALL?

    UNION Combines Rows From the Two Tables and Excludes all Duplicates Rows in the Result. Execution time wise, this is slower than UNION ALL since it has to eliminate duplicates.
    UNION All Returns all Rows in the UNION Operation, including Duplicates. Execution wise it is faster than UNION but it can perform slower at times due to the greater amount of data transferred across the network.

    Transaction and ACID properties

    A set of statements executed on a resource or resources applied with ACID properties is known as Transaction.

    Note: Here database is one resource but resource need not be only database

    ACID Properties:

    • Atomicity
    • Consistency
    • Isolation
    • Durability

    Atomicity:
    This property describes that the set of statements involved in the transaction has to work as a single unit (i.e, automic). This is the minimum requirement that has to be met for a transaction

    Consistency:
    If the data used in the transaction is consistent before starting the transaction, then it has to be left in a consistent state even after the transaction.
    Note: What is consistent data: data applied with some rules.
    Example:

    accountNo | accountBal
    101        | 6000

    Here lets define a rule that says if: accountBal>5000; then it is said to be in consistent state.
    so before transaction it is 6000 and after transaction lets say it has become 8000, then it is left in consistent state. If it falls below 5000, ie, lets say 4000, then it is said to be inconsistent state in which case we will rollback the transaction

    And if the data used in the transaction is already in inconsistent state then we cannot guarantee the consistent data after the transaction.

    Isolation:
    It is an ability to isolate the data used in one transaction from the other transaction.

    Durability:

    Once the transaction starts to save, it should definitely save all the statements within the transaction at any cost else rollback all the statements and throw exception.

    Mawazo

    Mostly technology with occasional sprinkling of other random thoughts

    amintabar

    Amir Amintabar's personal page

    101 Books

    Reading my way through Time Magazine's 100 Greatest Novels since 1923 (plus Ulysses)

    Seek, Plunnge and more...

    My words, my world...

    ARRM Foundation

    Do not wait for leaders; do it alone, person to person - Mother Teresa

    Executive Management

    An unexamined life is not worth living – Socrates

    javaproffesionals

    A topnotch WordPress.com site

    thehandwritinganalyst

    Just another WordPress.com site

    coding algorithms

    "An approximate answer to the right problem is worth a good deal more than an exact answer to an approximate problem." -- John Tukey

    %d bloggers like this: