April 15, 2008

JOIN in Oracle 10g

JOIN in Oracle 10g:

JOIN: Oracle 10g As your database grows so will the need to get information from more than one table.

A join is nothing more than writing a query that takes a set of rows from one or more tables and combines them together. The tables that will participate in the join are listed in the FROM clause of the SQL query and then, depending on the syntax we choose, rows from each table are linked by either the WHERE clause or the JOIN clause.

A join is a query that combines rows from two or more tables, views, or materialized views. Oracle performs a join whenever multiple tables appear in the query's FROM clause. The query's select list can select any columns from any of these tables. If any two of these tables have a column name in common, you must qualify all references to these columns throughout the query with table names to avoid ambiguity.

Equijoins or Inner Join:

Table_A and Table_B each have one column named LETTER.

Table_A.letter Table_B.letter

A A

B

C

An equijoin is a join with a join condition containing an equality operator ( = ).

An equijoin combines rows that have equivalent values for the specified columns.

The following SQL joins Table_A against Table_B where they share the same letter value.

SQL >SELECT Table_A.letter, Table_B.letter 2

FROM Table_A, Table_B 3

WHERE Table_A.letter = Table_B.letter;

LETTER LETTER

---------- ----------

A A

Self Joins:

A self join is a special form of equijoin or INNER JOIN where a table is joined against itself. This means that the table must exists two times in the FROM clause of the SQL query. A self join is a join of a table to itself. This table appears twice in the FROM clause and is followed by table aliases that qualify column names in the join condition. To perform a self join, Oracle combines and returns rows of the table that satisfy the join condition.

Left Outer Joins: Often times we need to return rows from one table even if there are no matching rows that are produced through a join condition. For this situation, we use outer joins. A left outer join is where the table, on the left of a FROM clause is required to return all of its rows regardless of having matching rows from the table it is being joined on. So in the following SQL we want to join Table_A to Table_B and show where they are equal on the column LETTER as well as return rows from Table_A that do not have a match on Table_B. The old method in Oracle was to supply the plus sign in parentheses (+) next to all columns in the WHERE clause that may not have values that will match against the table being joined to. Here is the old method in Oracle to produce a left outer join.

SQL >SELECT Table_A.letter, Table_B.letter 2

FROM Table_A, Table_B 3

WHERE Table_A.letter = Table_B.letter(+);

LETTER LETTER

---------- ----------

A A

B

Right Outer Joins:

A right outer join is just the opposite of a left outer join. It states that you would like all rows from the right table in the FROM clause to be returned regardless of having a true match defined in the WHERE clause against the left side table in the FROM clause. Here is the old method in Oracle for producing the right outer join.

SQL >SELECT Table_A.letter, Table_B.letter 2

FROM Table_A, Table_B 3

WHERE Table_A.letter(+) = Table_B.letter;

LETTER LETTER

---------- ----------

A A

C

Full Outer Joins:

The full outer join is a special condition. With Oracle's old method of supplying (+) notation, there is no method to explicitly state in a SQL statement that you want to return both left and right sides of a query regardless of having a match. To accomplish this you must write a left outer join SQL statement and UNION it with a right outer join SQL statement. Doing that produces the following SQL and results.

SQL >SELECT Table_A.letter, Table_B.letter

FROM Table_A, Table_B

WHERE Table_A.letter = Table_B.letter(+)

UNION

SELECT Table_A.letter, Table_B.letter

FROM Table_A, Table_B

WHERE Table_A.letter(+) = Table_B.letter;

LETTER LETTER

---------- ----------

A A

B C

Cartesian Products:

A cartesian join is a query where there is actually no join criteria between the table in the query. What is returned is a cartesian product where each row from a table is matched against every row in the other table. So in our example where Table_A has two rows and Table_B has two rows there will be 2 X 2 or 4 rows returned. The old method was to exclude the WHERE clause. SQL >

SELECT Table_A.letter, Table_B.letter 2

FROM Table_A, Table_B;

LETTER LETTER

---------- ----------

A A

A C

B A

B C

Creating DataFrames from CSV in Apache Spark

 from pyspark.sql import SparkSession spark = SparkSession.builder.appName("CSV Example").getOrCreate() sc = spark.sparkContext Sp...