

Performance Tuning SQL Queries

Starting here? This lesson is part of a full-length tutorial on using SQL for Data Analysis. Check out the beginning.

The lesson on subqueries introduced the idea that you can sometimes create the same desired result set with a faster-running query. In this lesson, you’ll learn to identify when your queries can be improved, and how to improve them.

The theory behind query run time

A database is a piece of software that runs on a computer, and is subject to the same limitations as all software: it can only process as much information as its hardware is capable of handling. The way to make a query run faster is to reduce the number of calculations that the software (and therefore hardware) must perform. To do this, you’ll need some understanding of how SQL actually makes calculations. First, let’s address some of the high-level things that will affect the number of calculations you need to make, and therefore your query’s runtime:

  • Table size: If your query hits one or more tables with millions of rows or more, the database has that many more rows to read and process, which can slow the query down.
  • Joins: If your query joins two tables in a way that substantially increases the row count of the result set, your query is likely to be slow. There’s an example of this in the subqueries lesson.
  • Aggregations: Combining multiple rows to produce a result requires more computation than simply retrieving those rows.
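
To make the join point concrete, here is a minimal sketch using Python's built-in sqlite3 module and an in-memory database. The tables orders and events and their columns are hypothetical, invented for this illustration; they are not tables from this tutorial. It shows how joining two tables on a non-unique key multiplies the row count, and how aggregating each table before joining keeps the intermediate result small.

```python
import sqlite3

# In-memory database with two small, made-up tables (illustration only).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE orders (user_id INTEGER, order_id INTEGER)")
cur.execute("CREATE TABLE events (user_id INTEGER, event_id INTEGER)")

# 3 users, each with 4 orders and 5 events.
orders = [(u, u * 100 + i) for u in range(3) for i in range(4)]
events = [(u, u * 100 + i) for u in range(3) for i in range(5)]
cur.executemany("INSERT INTO orders VALUES (?, ?)", orders)
cur.executemany("INSERT INTO events VALUES (?, ?)", events)


def count(sql):
    """Run a query that returns a single number and return that number."""
    return cur.execute(sql).fetchone()[0]


orders_rows = count("SELECT COUNT(*) FROM orders")
events_rows = count("SELECT COUNT(*) FROM events")

# Joining on user_id pairs every order with every event for the same user,
# so each user contributes 4 * 5 rows to the result.
joined_rows = count(
    """
    SELECT COUNT(*)
      FROM orders o
      JOIN events e
        ON o.user_id = e.user_id
    """
)

# Aggregating each table to one row per user first keeps the join small.
pre_aggregated_rows = count(
    """
    SELECT COUNT(*)
      FROM (SELECT user_id, COUNT(*) AS n FROM orders GROUP BY user_id) o
      JOIN (SELECT user_id, COUNT(*) AS n FROM events GROUP BY user_id) e
        ON o.user_id = e.user_id
    """
)

print(orders_rows, events_rows, joined_rows, pre_aggregated_rows)
```

With only 12 and 15 input rows, the plain join already produces 60 rows, while the pre-aggregated version produces one row per user. On tables with millions of rows, the same multiplication is what tends to make such joins slow, though actual timings depend on the database you are using.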

Query runtime also depends on some aspects of the database itself that you can’t really control:

  • Other users running queries: The more queries running concurrently on a database, the more the database must process at a given time and the slower everything will run. It can be especially bad if others are running particularly resource-intensive queries that hit large tables, row-multiplying joins, or heavy aggregations.
  • Database software and optimization: This is something you probably can’t control, but if you know the system you’re using, you can work within its bounds to make your queries more efficient.

For now, let’s ignore the things you can’t control and work on the things you can.