Missing data¶
Objective¶
Write queries that handle missing information (
NULL) correctly.
Key Points¶
Databases use a special value called
NULLto represent missing information.Almost all operations on
NULLproduceNULL.Queries can test for
NULLusingIS NULLandIS NOT NULL.
NULL values¶
In our console, we’re going to make sure NULL gets displayed more obviously.
.nullvalue -null-
Let’s try running a query where some of the returned values are NULL.
SELECT * FROM Visited;
id site dated
--- ----- ----------
619 DR-1 1927-02-08
622 DR-1 1927-02-10
734 DR-3 1930-01-07
735 DR-3 1930-01-12
751 DR-3 1930-02-26
752 DR-3 -null-
837 MSK-4 1932-01-14
844 DR-1 1932-03-22
NULL doesn’t behave like other values.
It gets left out from conditional searches.
SELECT * FROM Visited WHERE dated < '1930-01-01';
id site dated
--- ---- ----------
619 DR-1 1927-02-08
622 DR-1 1927-02-10
SELECT * FROM Visited WHERE dated >= '1930-01-01';
id site dated
--- ----- ----------
734 DR-3 1930-01-07
735 DR-3 1930-01-12
751 DR-3 1930-02-26
837 MSK-4 1932-01-14
844 DR-1 1932-03-22
Filtering NULL¶
To filter for NULL, we have to use the command IS
instead of =.
SELECT * FROM Visited WHERE dated IS NULL;
id site dated
--- ---- ------
752 DR-3 -null-
To omit NULL, we use IS NOT.
SELECT * FROM Visited WHERE dated IS NOT NULL;
id site dated
--- ----- ----------
619 DR-1 1927-02-08
622 DR-1 1927-02-10
734 DR-3 1930-01-07
735 DR-3 1930-01-12
751 DR-3 1930-02-26
837 MSK-4 1932-01-14
844 DR-1 1932-03-22
Without explicitely saying that we want to include NULL,
it is easy to filter them out accidentally.
SELECT * FROM Survey where quant = 'sal' and person != 'lake';
taken person quant reading
----- ------ ----- -------
619 dyer sal 0.13
622 dyer sal 0.09
752 roe sal 41.6
837 roe sal 22.5
We have to explicitely include NULL if we want to also include
those values.
SELECT * FROM Survey where quant = 'sal'
and (person != 'lake' or person IS NULL);
taken person quant reading
----- ------ ----- -------
619 dyer sal 0.13
622 dyer sal 0.09
735 -null- sal 0.06
752 roe sal 41.6
837 roe sal 22.5
Practice: Sorting by Known Date¶
Write a query that sorts the records in Visited by date, omitting entries for
which the date is not known (i.e., is null).
Solution
SELECT * FROM Visited WHERE dated IS NOT NULL ORDER BY dated ASC;
id site dated
--- ----- ----------
619 DR-1 1927-02-08
622 DR-1 1927-02-10
734 DR-3 1930-01-07
735 DR-3 1930-01-12
751 DR-3 1930-02-26
837 MSK-4 1932-01-14
844 DR-1 1932-03-22