Relational Database is the most critical tool that is used to store the data and SQL is the language used to query data from these databases. SQL is fundamental skillset that one should have to advance in the fields of Data Science and Machine Learning. In this post I will share top 10 most asked interview questions in querying the data. One cannot move to further levels of interview without cracking the SQL questions, so lets dive in to see them.
- Explain the basic data types in SQL
- Explain the anatomy of SQL query
- What is the order of execution in SQL query?
- FROM & JOIN:
These are first executed to determine the total working set of the data. It also includes sub queries in the dataset
This constraint is applied to filter the dataset which satisfy the conditions
- GROUP BY: Data is grouped based on the column mentioned here and the aggregate operation mentioned in the query will be calculated based on this column
- HAVING: Constraints from this command are applied on the GROUP BY results and the rows that doesn’t satisfy are filtered (Similar to a WHERE clause)
- SELECT: Expressions and columns mentioned here will be computed here
- DISTINCT: Duplicate columns will be removed with this command
- ORDER BY: Data will be ordered either in ascending or descending order based on the column mentioned here
- LIMIT/OFFSET: Data that falls outside the mentioned limit or offset are filtered
Numeric, Character, Boolean and Date are the common data type categories found in a any programming language. Knowing these datatypes is required to efficiently utilize the storage space of any database. Below we have discussed each of the data types in these categories.
Numeric is used to store the numbers in the database and below are the data types used in SQL
|small serial||2 bytes|
|big serial||8 bytes|
|double precision||8 bytes|
Character is used to store text related data.
|varchar(n)||variable length within limit|
|text||variable unlimited length|
Datetime is used to store the time based information such as dates and times
Boolean datatype is used to mention TRUE or FALSE. SQL doesn’t have a specific datatype as BOOLEAN but we use Numeric(1) to use boolean values. It takes 1 bit to store the values. TRUE is generally represented as ‘1’ while False is represented as ‘0’.
Generally any SQL query can be broken into 5 parts. Lets analyze the below SQL statements
1. Select * from Students where course=’ABC’ order by ID
2. Select ID, count(*) from Students group by course
SELECT – It describes the Operation to be performed
FROM – It describes the Source of the data, often it is a single table or multiple tables
WHERE – It describes the Condition on which the data needs to be selected
GROUP BY – It is a AGGREGATE Operator which aggregates the data based on the condition provided.
ORDER BY – It is a Post-Processing step, it takes the results and format the data like sorting and limiting
Below is the order of execution of the statements in SQL Query
Know the frequently asked interview questions on Machine Learning using the link ML Interview Questions