Missing data¶
Objective¶
Write queries that handle missing information (
NULL
) correctly.
Key Points¶
Databases use a special value called
NULL
to represent missing information.Almost all operations on
NULL
produceNULL
.Queries can test for
NULL
usingIS NULL
andIS NOT NULL
.
NULL values¶
In our console, we’re going to make sure NULL
gets displayed more obviously.
.nullvalue -null-
Let’s try running a query where some of the returned values are NULL
.
SELECT * FROM Visited;
id site dated
--- ----- ----------
619 DR-1 1927-02-08
622 DR-1 1927-02-10
734 DR-3 1930-01-07
735 DR-3 1930-01-12
751 DR-3 1930-02-26
752 DR-3 -null-
837 MSK-4 1932-01-14
844 DR-1 1932-03-22
NULL
doesn’t behave like other values.
It gets left out from conditional searches.
SELECT * FROM Visited WHERE dated < '1930-01-01';
id site dated
--- ---- ----------
619 DR-1 1927-02-08
622 DR-1 1927-02-10
SELECT * FROM Visited WHERE dated >= '1930-01-01';
id site dated
--- ----- ----------
734 DR-3 1930-01-07
735 DR-3 1930-01-12
751 DR-3 1930-02-26
837 MSK-4 1932-01-14
844 DR-1 1932-03-22
Filtering NULL¶
To filter for NULL
, we have to use the command IS
instead of =
.
SELECT * FROM Visited WHERE dated IS NULL;
id site dated
--- ---- ------
752 DR-3 -null-
To omit NULL
, we use IS NOT
.
SELECT * FROM Visited WHERE dated IS NOT NULL;
id site dated
--- ----- ----------
619 DR-1 1927-02-08
622 DR-1 1927-02-10
734 DR-3 1930-01-07
735 DR-3 1930-01-12
751 DR-3 1930-02-26
837 MSK-4 1932-01-14
844 DR-1 1932-03-22
Without explicitely saying that we want to include NULL
,
it is easy to filter them out accidentally.
SELECT * FROM Survey where quant = 'sal' and person != 'lake';
taken person quant reading
----- ------ ----- -------
619 dyer sal 0.13
622 dyer sal 0.09
752 roe sal 41.6
837 roe sal 22.5
We have to explicitely include NULL
if we want to also include
those values.
SELECT * FROM Survey where quant = 'sal'
and (person != 'lake' or person IS NULL);
taken person quant reading
----- ------ ----- -------
619 dyer sal 0.13
622 dyer sal 0.09
735 -null- sal 0.06
752 roe sal 41.6
837 roe sal 22.5
Practice: Sorting by Known Date¶
Write a query that sorts the records in Visited
by date, omitting entries for
which the date is not known (i.e., is null).
Solution
SELECT * FROM Visited WHERE dated IS NOT NULL ORDER BY dated ASC;
id site dated
--- ----- ----------
619 DR-1 1927-02-08
622 DR-1 1927-02-10
734 DR-3 1930-01-07
735 DR-3 1930-01-12
751 DR-3 1930-02-26
837 MSK-4 1932-01-14
844 DR-1 1932-03-22