The main objective of this app is to test query optimisation of a PostgreSQL database and a rails API. The app offers infinite-scrolling using the will-paginate gem. The app is simple, containing one controller and one model. It was designed to be a hybrid: if requested, it returns a JSON like an API. It also has a view which can be accessed via rails server with the index path.
- Clone to local reports
- Run 'bundle install'
- Run 'Rails Server'
Run 'rspec' on the terminal
The following tables are meant to show the improvements made to the database performance, as scalability is meant to be a key component of this app's functionality. The database was seeded with 5000 profiles before running EXPLAIN ANALYZE on various possible PostgreSQL queries.
Thoughts: As there was only one model, and one table, there were no complex relations to slow the query down. On the other hand, it's providing more profiles than needed and in no order, therefore it could be better.
QUERY PLAN Seq Scan on profiles (cost=0.00..158.55 rows=5055 width=138) (actual time=0.059..3.551 rows=5055 loops=1) Planning time: 53.027 ms Execution time: 4.234 ms (3 rows)
Thoughts: Pagination set up the query faster than without pagination and since it is only querying a limited amount of records at a time, it's naturally faster than the previous query.
QUERY PLAN Limit (cost=0.31..0.63 rows=10 width=138) (actual time=0.036..0.041 rows=10 loops=1) -> Seq Scan on profiles (cost=0.00..158.55 rows=5055 width=138) (actual time=0.031..0.035 rows=20 loops=1) Planning time: 0.275 ms Execution time: 0.079 ms (4 rows)
Thoughts: Social media feeds always have some sort of order to them. For this, I figured the most useful ordering system might be the updated_at column. Using sort on this column adds a lot of time. It also adds memory usage.
QUERY PLAN Limit (cost=293.09..293.11 rows=10 width=138) (actual time=5.933..5.936 rows=10 loops=1) -> Sort (cost=293.06..305.70 rows=5055 width=138) (actual time=5.930..5.932 rows=20 loops=1) Sort Key: updated_at DESC Sort Method: top-N heapsort Memory: 30kB -> Seq Scan on profiles (cost=0.00..158.55 rows=5055 width=138) (actual time=0.076..3.826 rows=5055 loops=1) Planning time: 0.403 ms Execution time: 6.000 ms (7 rows)
Thoughts: In order to set up the index scan, a lot of time is lost up-front. After this, future queries are much quicker. As the offset increases, there are fewer records to look through which means that the pagination speeds up overall.
QUERY PLAN Limit (cost=0.78..1.27 rows=10 width=138) (actual time=0.813..0.824 rows=10 loops=1) -> Index Scan Backward using index_profiles_on_updated_at on profiles (cost=0.28..251.12 rows=5055 width=138) (actual time=0.800..0.817 rows=20 loops=1) Planning time: 20.358 ms Execution time: 0.980 ms (4 rows)
QUERY PLAN Limit (cost=5.24..5.74 rows=10 width=138) (actual time=0.104..0.128 rows=10 loops=1) -> Index Scan Backward using index_profiles_on_updated_at on profiles (cost=0.28..251.12 rows=5055 width=138) (actual time=0.021..0.115 rows=110 loops=1) Planning time: 0.143 ms Execution time: 0.207 ms (4 rows)
Early on, I had made the decision to use the will_paginate ruby gem. While this made pagination easier, it uses an offset parameter which numerous blogs seems to indicate decreases performance. If I had more time, I would remove the gem and look into using the seek method with an index.
One table in use: "public.profiles"
Column | Type | Modifiers |
---|---|---|
id | integer | not null default nextval('profiles_id_seq'::regclass) |
name | character varying | |
geolocation | character varying | |
photo | character varying | |
created_at | timestamp without time zone | not null |
updated_at | timestamp without time zone | not null |
Indexes: "profiles_pkey" PRIMARY KEY, btree (id) "index_profiles_on_updated_at" btree (updated_at)