A Humble Request from a Former DBA
Querying a database is a thing that most people aren’t too familiar with unless they are in the tech world. The funny thing is that most people are doing it everyday without even realizing it. Every time you pull up Google to search for funny cat pictures or that word you can’t remember how to spell, you are querying Google’s database. The search term that you input is being sent to Google’s databases in the form of SQL (Structured Query Language) and the search results are returned back to your browser.
Now try to imagine for a minute, how does Google go through the MASSIVE amounts of data to find the explanation for that new meme you don’t quite understand in milliseconds. The answer, my friend, is that Google uses indexes to optimize the speed of the searches that they are getting. According to what I just googled, an index is a data structure that improves the speed of data retrieval operations on a database table. An index can be used to efficiently find all rows matching some column in your query and then walk through only that subset of the table to find exact matches.
You probably just read that and went okay, but why do I care about that? I’ll tell you why. Indexes are hugely important for any programmer that works with databases. Nobody likes querying a table, then having to wait 15 minutes before getting back the information that they need now. Creating an index (or indexes) can make those results come back in seconds, if they are built correctly.
I want to show you an example of how to use the ActiveRecord class in Ruby to add an index to a table.
Run ‘rake db:migrate’ in your terminal to get the new table and indexes out there.
I used Faker to populate this table with data using “rake db:seed”:
Now, it’s time to test out our new indexes!
A good rule of thumb to use when creating indexes is to use the most unique column name in the WHERE statement as the column in the index, then use the next most unique column, etc. The query that I wrote looks for the character name and the city where they live in the WHERE statement.
As you can see from the explain, we are using the index, index_thrones_on_character_and_city, when we run the find_city_where_character_lives method. The two other indexes weren’t chosen because ActiveRecord chooses the index that most closely matches the WHERE statement of the query. Unfortunately, the explain in ActiveRecord doesn’t seem to give us the amount of time that each query took, so that we can compare how long each query takes.
The worst thing that can happen if you don’t use an index is that it will do a table scan, meaning that it will look at each row individually instead of using an index.
If you’re curious, this is what happens if you query the unique id column that is made by default when you create a new table in ActiveRecord.
It is very valuable for a programmer to be aware of how effective and efficient their calls to databases are. The example that I used is a small table, but when you run an inefficient query against millions and billions of rows. The queries will take a long time to run and you won’t be happy.
My humble request as a former Database Administrator is that you use this introduction to indexes to help make your queries run faster and more efficiently. Your next DBA will thank you!
Sources:
https://blog.carbonfive.com/2016/11/16/rails-database-best-practices/
https://guides.rubyonrails.org/active_record_querying.html#running-explain
https://github.com/stympy/faker
https://github.com/stympy/faker/blob/master/doc/v1.9.1/game_of_thrones.md