This is a solution to parts A and B of the log puzzle exercise from Google's Python Class. The exercise requires you to use Python to download image slices whose URLs are found in an Apache server log file.
- Python version used: 3.7
- IDE used: Spyder
- Modules used: urllib, re, sys, os
- The function read_urls takes the log file name as its parameter
- We read the contents of the log file and save them into a list
- We use a regex to extract the image URLs and concatenate each one with the server name. The complete URLs are saved one by one into a list, urls
- The read_urls function returns the list of URLs
- The download_images function takes the list of URLs and the destination folder as parameters. The destination folder is the directory where the downloaded images are stored
- We download the images one by one using urllib.request.urlretrieve. Each file is named using a name extracted with a regex
- At the end, we write an HTML file and add the images to it using the HTML img tag
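
The steps above can be sketched roughly as follows. This is a minimal illustration rather than the actual solution code: the regex patterns, the `server` parameter, the helper names `file_name` and `build_index_html`, and the output file name `index.html` are all assumptions made for the example.

```python
import os
import re
import urllib.request


def read_urls(filename, server):
    """Return the full image URLs found in an Apache log file.

    `server` (e.g. "http://example.com") is a hypothetical parameter;
    the README does not say where the server name comes from.
    """
    with open(filename) as f:
        lines = f.readlines()
    urls = []
    for line in lines:
        # Assumed pattern: the path of a GET request ending in .jpg
        match = re.search(r'"GET (\S+\.jpg) HTTP', line)
        if match:
            urls.append(server + match.group(1))
    return urls


def file_name(url):
    """Extract the last path component of a URL (assumed naming rule)."""
    return re.search(r"[^/]+$", url).group(0)


def build_index_html(names):
    """Build a small HTML page that shows the images side by side."""
    tags = "".join('<img src="{}">'.format(name) for name in names)
    return "<html><body>\n" + tags + "\n</body></html>\n"


def download_images(urls, dest_dir):
    """Download each URL into dest_dir and write index.html next to them."""
    os.makedirs(dest_dir, exist_ok=True)
    names = []
    for url in urls:
        name = file_name(url)
        urllib.request.urlretrieve(url, os.path.join(dest_dir, name))
        names.append(name)
    with open(os.path.join(dest_dir, "index.html"), "w") as f:
        f.write(build_index_html(names))
```

With these definitions, a call such as `download_images(read_urls("some.log", "http://example.com"), "out")` would fetch the images into `out` and write `out/index.html`; the log file name and server shown here are placeholders.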