/Python-Monthly-Audit-Tracker

Python: Monthly Audit Tracker

Primary LanguageJupyter Notebook

Projct1: Python Monthly Audit Tracker

Monthly Audit Issue Tracker by using Python-docx package

Table of Contents

  1. Introduction
  2. Goal
  3. Data Validation
  4. Script Details
    4.1 Data Preparation
    4.2 Body

1. Introduction

  • For every first date of the month, track company's audit action plans status in order to prevent overdue or aging.

2. Goal

  • Generate monthly Audit Action Plan Tracking Report in Python.

Objects:

  1. Perform data cleaning for a raw data from Database.
  2. Filter and generate data in certain format.
  3. Configure word document formats: such as font, style, color and margin.

Features:

Result:

Information Protection Department Report Sample


3. Data Validation

! Every record in the data is made up and does not reflect the real-life information !

FIELDS:

  • Project ID (Format YYYY-###): Identification for each project
  • Project Name: Audit project name
  • Issue Number: Issue numbering
  • DepartmentResponsible: Name of department audited
  • Action Status: Status of action plan as ("Open"|"Closed"|"Pending Verification")
  • Target Date: Initial target date to complete action plan
  • Revised Target Date: Target date when extedned one time or Latest Revised Target Date
  • 1st Revised Target Date: Target date when extended two times
  • 2nd Revised Target Date: Target Date when extended three times
  • 3rd Revised Target Date: Target Date when extended four times
  • Management Action: Action Detail in Issue Number

4. Script Details

4.1 Data Preparation


#Revised Date chronic order
for i in range(0, len(Data_Repository)):
    if pd.isnull(Data_Repository.iloc[i]['3rd Revised Target Date']) == False:
        list = [Data_Repository.iloc[i]['1st Revised Target Date'], Data_Repository.iloc[i]['2nd Revised Target Date'], Data_Repository.iloc[i]['3rd Revised Target Date']]
        list.sort(key = lambda date: datetime.strptime(date, '%m.%d.%y'))
        Data_Repository.loc[i, '1st Revised Target Date'] = Data_Repository.loc[i, '1st Revised Target Date'] = list[0]
        Data_Repository.loc[i, '2nd Revised Target Date'] = Data_Repository.loc[i, '2nd Revised Target Date'] = list[1]
        Data_Repository.loc[i, '3rd Revised Target Date'] = Data_Repository.loc[i, '3rd Revised Target Date'] = list[2]
    elif pd.isnull(Data_Repository.iloc[i]['2nd Revised Target Date']) == False:
        list = [Data_Repository.iloc[i]['1st Revised Target Date'], Data_Repository.iloc[i]['2nd Revised Target Date']]
        list.sort(key = lambda date: datetime.strptime(date, '%m.%d.%y'))
        Data_Repository.loc[i, '1st Revised Target Date'] = Data_Repository.loc[i, '1st Revised Target Date'] = list[0]
        Data_Repository.loc[i, '2nd Revised Target Date'] = Data_Repository.loc[i, '2nd Revised Target Date'] = list[1]

Data Error Sample

  • Notice that "Payment Processing" project has error in Target Date; 1st Revised Target Date comes after 2nd Revised Target Date.
    • This is due to human error, entering the date in wrong order.
  • Since "Target Date" and "Revised Target Date" are not influenced by this error, simply re-arrange 1st & 2nd & 3rd Revised Target Date in chronic order by using list.sort.

4.2 Body

Every code for 4. Body is inside of function called body. Thus, be mindful of indentation.

Word.docx Setting


def body(To, cc, dept, Date1, Date2, evidence):
   doc = docx.Document()
   section = doc.sections[0]
   section.top_margin = Inches(0.62)
   section.bottom_margin = Inches(0.31)
   section.left_margin = Inches(0.75)
   section.right_margin = Inches(0.81)
   normal_style = doc.styles['Normal']
   normal_style.font.name = 'Arial'
   normal_style.font.size = Pt(10)
   normal_style.font.color.rgb = RGBColor(31, 73, 125)
  • This is a preset of word document format.

  • doc = docx.Document() is Document constructor from Python-docx package.

    • Every Document objects must follow after the initial constructor.
  • section. configure fortmat of [Layout --> Margins] in Document. Doc Margins Configuration

  • normal_style. configure format of [Home --> Styles --> Normal] in Document. Doc Style Configuration

Note that body function has six different arguments
* To & cc & evidence --> defined in dictionary from {Departments} section
* dept --> defined in list from {Departments} section
* Date1 & Date2 --> already defined in {Config This Before Run} Section


    main1 = '''
To: {}

cc: {}
    '''.format(To, cc)

Email Receivers

  • main1 prints names of employees who will receive report email.
    • "To:" for head of department
    • "cc:" for relevant employees

    line1 = '''
Subject: {} Open Audit Issues / Action Plans Summary as of {}
    '''.format(dept, Date1)

Subject line

  • line1 prints subject line of email.
    • contains the name of department and date of report.

    line2 = '''

Please find the Outstanding Audit Issues/Action Plans Summary as of {} (attached).

When actions have been completed, please provide the supporting evidence to close the action. Email evidence to {}.

Should you need to revise the target completion date, please send an email to Internal Audit DH, or designee, noting the revised date, reason for the delay and interim action to mitigate risk, as applicable.

Please note: Target Date extensions will need to be approved by the respective Division Head, via email.

    '''.format(Date2, evidence)

line2

  • line2 prints body paragraph of email.
    • "evidence" for whom to send evidence of action plan. Mostly auditor who is in charge of (in this case, Ted Jung).

    parag = doc.add_paragraph(main1, 'Normal')
    parag.add_run(line1).font.color.rgb = RGBColor(0, 32, 96)
    parag.add_run(line2)
  • To add texts in Document, initial paragraph must be created by parag..
    • multiple paragraphs can exist as different groups.
  • After initial texts of paragraph, additional lines can be added by add_run.
    • each added line is subordinated to paragraph and follows paragraph's format unless defined seperately like 'line1'.

Data Prep Setting


    excel = Data_Repository[Data_Repository['DepartmentResponsible'] == dept]
    n1 = 2
  • Filter only necessary departments in the report.
  • First two rows (n1 = 2) in the table are title and column names.

    if excel['3rd Revised Target Date'].notnull().any() == True:
        excel = excel[['Issue # Ref', 'Action Status', 'Target Date', '1st Revised Target Date', '2nd Revised Target Date', '3rd Revised Target Date', 'Revised Target Date', 'Audit Name', 'Brief Description']]
    elif excel['2nd Revised Target Date'].notnull().any() == True:
        excel = excel[['Issue # Ref', 'Action Status', 'Target Date', '1st Revised Target Date', '2nd Revised Target Date', 'Revised Target Date', 'Audit Name', 'Brief Description']]
    elif excel['1st Revised Target Date'].notnull().any() == True:
        excel = excel[['Issue # Ref', 'Action Status', 'Target Date', '1st Revised Target Date', 'Revised Target Date', 'Audit Name', 'Brief Description']]
    elif excel['Revised Target Date'].notnull().any() == True:
        excel = excel[['Issue # Ref', 'Action Status', 'Target Date', 'Revised Target Date', 'Audit Name', 'Brief Description']]    
    else:
        excel = excel[['Issue # Ref', 'Action Status', 'Target Date', 'Audit Name', 'Brief Description']]
  • Default format --> 'else:'.
  • If action plan gets extended, then:
    • If extended one time --> Revised Target Date
    • If extended two times --> Revised Target Date and 1st Revised Target Date
    • If extended three times --> Revised Target Date and 1st Revised Target Date and 2nd Revised Target Date
    • If extended four times --> Revised Target Date and 1st Revised Target Date and 2nd Revised Target Date and 3rd Revised Target Date
  • Number of columns will vary on the maximum number of Target Date extended.
    • Each department has different number of extension and therefore it is necessary to set columns differently by each circumstance.

For example:
Commercial Credit_01.02.23.docx does not have any action extended its Target Date and therefore has only 5 columns.
Servicing & Loyalty_01.02.23.docx has one action extended four times, and therefore has 9 columns


    columns = excel.columns
    n2 = len(columns)
    excel = excel.fillna('')
  • Replace nan to '' for clear visualization in the table, otherwise you would see word nan in blank space.

Excel Function Setting


def make_rows_bold(*rows):
    for row in rows:
        for cell in row.cells:
            for paragraph in cell.paragraphs:
                for run in paragraph.runs:
                    run.font.bold = True
  • Function to make text of the cell in table Bold.

shading_elm_1 = parse_xml(r'<w:shd {} w:fill="#DEEAF6"/>'.format(nsdecls('w')))
  • Function to fill first row cells with certain color.

def set_table_header_bg_color(cell):
    """
    set background shading for Header Rows
    """
    tblCell = cell._tc
    tblCellProperties = tblCell.get_or_add_tcPr()
    clShading = OxmlElement('w:shd')
    clShading.set(qn('w:fill'), "#D9D9D9")
    tblCellProperties.append(clShading)
  • Function to fill second row cells with certain color.
  • Another way of coding the color.

def set_table_header_bg_color2(cell):
    """
    set background shading for Header Rows
    """
    tblCell = cell._tc
    tblCellProperties = tblCell.get_or_add_tcPr()
    clShading = OxmlElement('w:shd')
    clShading.set(qn('w:fill'), "#FFC7CE")
    tblCellProperties.append(clShading)
  • Function to fill Target Date that is overdue with pink.
  • Each color requires each function to define.

Excel Setting


    #Table Basic Format
    table = doc.add_table(rows = n1, cols = n2)
    table.allow_autofit = True
    table.alignment = WD_TABLE_ALIGNMENT.LEFT
    table.cell(0, 0).text = 'High-level Status' #First row text
    make_rows_bold(table.rows[0]) #First row Bold
    table.cell(0, 0).merge(table.rows[0].cells[-1]) #First row merge
    table.cell(0,0)._tc.get_or_add_tcPr().append(shading_elm_1) #First row color
    for col in range(len(columns)): #Column Names
        table.cell(1, col).text = columns[col]
        table.cell(1,col).paragraphs[0].paragraph_format.alignment = WD_TABLE_ALIGNMENT.CENTER
    for each_cell in table.rows[1].cells: #Second row color
        set_table_header_bg_color(each_cell)
  • Every subordiante codes must follow after 'doc.add_table'.
  • To indicate particular cell, two ways:
    • table.cell(0,0)
    • table.rows[0].cells[0]

    #Table Adding Contents
    for i, row in enumerate(excel.iloc[0:].itertuples()):
        table_row = table.add_row().cells
        for col in range(0, n2):
            table_row[col].text = str(row[col+1]) #+1 for ignore index
            table_row[col].paragraphs[0].paragraph_format.alignment = WD_TABLE_ALIGNMENT.LEFT
  • Loop creating rows by number of action plans in each department.
  • Index column is in 0 at 'excel'. Therefore, add 1 to skip Index.

    #Expired Action Highlight
    if len(excel.columns) >= 6:
        for row in range(3, len(table.rows) + 1): #3 here starts from 1
            if table.cell(row, -3).text != '': #-3 = 'Revised Target Date #If this cell has been revised
                if datetime.strptime(table.cell(row, -3).text, "%m.%d.%y") < today:
                    set_table_header_bg_color2(table.rows[row -1].cells[-3]) 
            else: #If cell has not been revised
                if datetime.strptime(table.cell(row -1, 2).text, "%m.%d.%y") < today: #2 = Target Date
                    set_table_header_bg_color2(table.rows[row - 1].cells[2])
                    
    elif len(excel.columns) == 5:
        for row in range(3, len(table.rows) + 1):
            if datetime.strptime(table.cell(row, -3).text, "%m.%d.%y") < today: #Since no Revised column, -3 = Target Date
                set_table_header_bg_color2(table.rows[row -1].cells[2])
  • If the department has at least one action plan revised before --> Number of Clolumn >= 6
  • If the department has no action plan revsied ever --> Number of Column = 5
  • Action plan should have been remediated by the latest Target Date (either revised or not). If not, then overdue action plan and needs attention.

    #Change whole table text color
    for row in table.rows:
        for cell in row.cells:
            paragraphs = cell.paragraphs
            for paragraph in paragraphs:
                for run in paragraph.runs:
                    run.font.color.rgb = RGBColor(0,0,0)
  • Text color is previously influenced by {Word.docx} font color. Therefore, re-set the color only for Table