Review:
1. Find employees who make more than all managers
2. Find the maximum average salaries of all departments
3. Find the number of columns in EMP table
4. Find duplicate records in the table created by the following scripts:
drop table duplicates;
create table duplicates ( x number(2), y char(1), z date );
insert into duplicates Values (12, 'Y', '12-Jan-1998');
insert into duplicates Values (12, 'Y', '12-Jan-1998');
insert into duplicates Values (10, 'N', '12-Jan-2008');
insert into duplicates Values (12, 'Y', '12-Jan-1998');
Lecture 1: Union, Minus, and Intersect (No Detailed Lecture Notes)
Example 1: Show all employees and their annual income (do not use NVL function)
Example 2: Find the department that has no employees
Example 3: Find duplicate records between two tables
Example 4: Find how much is needed to have 2% annual increase for all the salesmen, 4% increase for all clerks, and 6% increase for the rest of the employees
--the following is a wrong answer
select sum(inc) from (
(select 0.02*sal as inc from emp where job='SALESMAN') union
(select 0.04*sal as inc from emp where job = ‘CLERK’)
union
(select 0.06*sal as inc from emp where job not in (‘CLERK’,'SALESMAN'))
);
Lecture 2: Tuple Variables (No detailed Lecture Notes)
Example 1: Create a directory that shows employee names, jobs, departments, as well names of the mangers
Example 3: Find the employee who has the longest history in the company at each location
Example 4: Find the employee in each department who has the smallest salary in the corresponding department.
Example 5: Give a $200 salary increase to the person who has the smallest salary in each location:
Example 6: Find the names of those employees who have salary more than the average of their respective departments.
Lecture 3: Copy Data from one table to another (no detailed lecture notes)
Create table as Select ….
Insert into …. Select …
- Example 1: Create a new table called salesman based on the information in EMP table
- Example 2: Create a table called CS102, which include the students registered in Computer Science 102. (Get student id, name, and major from students, registered_students, and classes tables)
- Example 3: Copy all clark’s salary, comm., and job into temp table
Additional Exercises:
- Create a new table called CLERKS based on EMP table so that it contains all the clerks in EMP
- Copy the name, salary, and commission of each employee hired before 1980 into TEMP table
- Copy the dname and the total salary of each department into TEMP_TABLE table
- Find the clerk who has the highest salary among all clerks
- Create a table based on EMP and DEPT tables such that it list each employee’s name, job, hire date, department name, and location
- Find departments sound like OPRATION
- Give all employees under BLAKE’s supervision 5% salary increase and 1000 bonus commission
- Which job has the most employees?
- Delete the salesman who has the highest salary among all salesmen
Lecture 4: Fuzzy Search using soundex(), UTL_MATCH package, and contains()
Practice Questions on Functions and Expressions:
Example 1: Find the annual salary of each employee
Select ename, 12*sal + nvl(comm,0) from emp;
Or
Select ename, 12*sal + decode(comm, null, 0, comm) from emp;
Example 2: Find the department whose total annual salary is more 20000
Example 3: Find the annual income (annual salary + commission) of each employee
Example 4: Find the job whose total annual income is more than 60000
Example 5: Find the hire year of each employee
Example 6: Find the number of years being hired for each employee
Example 7: Find the number of employees being hired each month
Sounds Like: soundex()
Example 1: Find the job that sounds like ‘clark’
Example 2: Find the employees whose name sounds like ‘smee’
Spells like: UTL_MATH package
UTL_MATCH package has several functions that help find fuzzy matching of words: edit_distance() finds the Levenshtein distance between two strings, i.e., the minimum number of single-character edits (insertions, deletions or substitutions) required to change one string into the other, JARO_WINKLER() finds the measure of agreement between two strings based on the algorithm originally proposed as Jaro distance in 1989 by Matthew Jaro and a variant was added in 1999 by William Winkler. Here are examples of finding the distances and similarities between two strings.
select utl_match.edit_distance ( 'Steven' , 'Stephen' )
from dual;
select utl_match.edit_distance_similarity ( 'Steven' , 'Stephen' )
from dual;
select cast (utl_match.jaro_winkler ( 'Steven' , 'Stephen' ) as number(5,5))
from dual;
select utl_match.jaro_winkler_similarity ( 'Steven' , 'Stephen' )
from dual;
Find employees whose names are similar to SMYS
select ename, utl_match.edit_distance_similarity(ename, 'SMYS'), utl_match.jaro_winkler_similarity(ename, 'SMYS') from emp where utl_match.jaro_winkler_similarity(ename, 'SMYS') > 50;
Find employees who have similar names
select a.ename, b.ename from emp a, emp b where a.empno <> b.empno and utl_match.jaro_winkler_similarity(a.ename, b.ename) > 70;
Find similar records based on jaro_winkler_similarity of all text columns
--find all text columns
select column_name, data_type from user_tab_columns where table_name = 'EMP' and data_type in ('CHAR', 'VARCHAR', 'VARCHAR2');
--find similarity threshod to 70 select * from emp a left join emp b on (utl_match.jaro_winkler_similarity(a.ename, b.ename) >=70 and utl_match.jaro_winkler_similarity(a.job, b.job) >=70) where a.empno != b.empno;
Lecture 5: Text Search
The function contains(string1, string2, label) evaluate whether string1 contains string2. It can used to search whether an article, string1, contains certain words, string2. The label parameter is a number, which can be used to retrieve the match score using score() function.
In order to use contains() to search a text column, you will need to create an index for the column first. For example,
create index respindex on responses (text)
indextype is ctxsys.context
parameters ('Sync (on commit)');
The contains() is a scoring function that measures how much is the match. Then you can search the text as follows:
select ID, text, scorer(1)
from responses
where contains(text, ‘unix’, 1) > 0;
- Search for exact match of multiple words (use AND and OR)
Example 1: Find the responses that contain word ‘unix’ or ‘VC++’
Select text From responses
Where contains(text, ‘unix OR vc++’) > 0;
Example 2: Find the responses that contains both words ‘unix’ and ‘windows’
Select text From responses
Where contains(text, ‘unix AND windows’) > 0;
Example 3: Find the responses that contains both words ‘unix’ and “windows” or ‘VC++’
Select text From responses
Where contains(text, ‘VC++ or (unix AND windows)’) > 0;
Note: If a phrase contains the words AND or OR in it, use {and} or {or}
- Search for words that near each other (use key word near)
Example: Find the responses in which the words unix is close to windows
Select text From responses
Where contains(text, ‘unix NEAR windows’) > 0;
- Search for words that share the same stem (work, works, working, worked, … Use $ in front the stem)
Example: Find the responses which contains a variant of the word ‘support’
Select text From responses
Where contains(text, ‘$support’) > 0;
- Search for fuzzy matches (use ? in front of the fuzzy words)
Example: Find the responses which contains a word like vaariable
Select text From responses
Where contains(text, ‘?vaariable’) > 0;
- Search for words that sounds like other words
Example: Find the responses which contains a words sounds like unix:
Select text From responses
Where contains(text, ‘!uniqx’) > 0;
Homework (submit a Word file with question number, the question, your answer, a query result screen shot):
- Find the employee in each job who makes the smallest salary in their job titles
- Find employees who make less than their departmental maximum
- Find those with manager job title but have no supervisees
- Find the total amount of money needed in order to give 5% salary increase to all mangers and 4% increase to everybody else
- Find each employee who is making more than the average of his or her job line
- Give 500 commission to the employee who has the longest employment in the company
- Create a table based on EMP and DEPT tables such that it list each employee’s name, job, hire date, department name, and location.
- Give everybody 10% salary increase in the departments whose maximum salaries are less than 2000
- For all the employees supervised by KING, give them a salary increase, which is equal to the 2% of the company’s average salary
- Find possible duplicate records in the EMP table using similar enames, job title, and salary numbers (use abs(a - b) for the absolute difference between numbers a and b)
|