Lazy-Loading Sites w/ Incomplete Data
This seems somewhat related to #856, but I wanted to highlight a specific issue that I noticed when searching through the results (awesome tool, btw)!
Goal:
Support scanning of sites that render content after the initial page load. When looking through the site-scanning results for one of the sites I manage, I noticed the USWDS score was much lower than expected. Upon further investigation, I found that the initial response from the web server was an abbreviated set of HTML and JavaScript files, which in turn rendered additional content. I created a gist that shows the initial server response and the fully rendered page (a difference of ~10k lines).
By my estimation, the labs.gsa.gov/s site should have a USWDS score of:
eval criteria | score | site-scan code ref | gist ref |
---|---|---|---|
.usa classes | 40 | https://github.com/18F/domain-scan/blob/master/scanners/uswds2.py#L36 | multiple |
Source Sans | 5 | https://github.com/18F/domain-scan/blob/master/scanners/uswds2.py#L100 | https://gist.github.com/mvogelgesang/94b137577c44f7e3fbf2fd9e4dd65c53#file-after-page-load-html-L9402 |
uswds in css body | 20 | https://github.com/18F/domain-scan/blob/master/scanners/uswds2.py#L115 | multiple |
uswds version in body | 20 | https://github.com/18F/domain-scan/blob/master/scanners/uswds2.py#L125 | https://gist.github.com/mvogelgesang/94b137577c44f7e3fbf2fd9e4dd65c53#file-after-page-load-html-L9166 |
TOTAL | 85 | | |
Tasks:
- Determine the best way to identify sites whose initial response triggers additional JavaScript (or other actions) that dynamically renders the page.
- Allow the page to render fully before performing scan activities, specifically for the USWDS and theme criteria.
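
A rough sketch of how the first task might be approached, assuming the scanner can get both the raw server response and the rendered DOM as strings (for example from a headless browser, which is not shown here). The names `tag_stats` and `looks_client_rendered` and the `growth_ratio` threshold are hypothetical and not part of domain-scan; the `usa-` prefix check is only a stand-in for the real USWDS criteria:

```python
from html.parser import HTMLParser


class _TagStats(HTMLParser):
    """Count start tags and collect class tokens that begin with 'usa-'."""

    def __init__(self):
        super().__init__()
        self.tag_count = 0
        self.usa_classes = set()

    def handle_starttag(self, tag, attrs):
        self.tag_count += 1
        for name, value in attrs:
            if name == "class" and value:
                self.usa_classes.update(
                    token for token in value.split() if token.startswith("usa-")
                )


def tag_stats(html):
    """Return (number of start tags, set of 'usa-' class names) for an HTML string."""
    parser = _TagStats()
    parser.feed(html)
    parser.close()
    return parser.tag_count, parser.usa_classes


def looks_client_rendered(initial_html, rendered_html, growth_ratio=2.0):
    """Heuristic (assumed threshold): flag the page when the rendered DOM has
    at least `growth_ratio` times as many start tags as the initial response."""
    initial_tags, _ = tag_stats(initial_html)
    rendered_tags, _ = tag_stats(rendered_html)
    return rendered_tags >= growth_ratio * max(initial_tags, 1)


# Toy inputs standing in for the two files in the gist.
initial = '<html><body><div id="app"></div><script src="main.js"></script></body></html>'
rendered = (
    '<html><body><div id="app">'
    '<header class="usa-header"><nav class="usa-nav"></nav></header>'
    '<main><section class="usa-section"><p>hi</p></section></main>'
    '<footer class="usa-footer"></footer>'
    '</div></body></html>'
)

if looks_client_rendered(initial, rendered):
    # Only the rendered DOM exposes the USWDS classes in this toy case.
    print(sorted(tag_stats(rendered)[1]))
```

If a check like this flags a site, the scanner could run the USWDS and theme checks against the rendered DOM instead of the initial response. The ratio is a guess and would need tuning against real sites.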
Acceptance Criteria:
- A lazy-loading page returns the score that results from scanning the fully loaded page.
Moving this issue over to GSA/site-scanning#35