CivicDataLab/zombie-tracker

Create a data mining plan for March/April, 2020

Requirements: Fetch data from eCourts for cases registered under Section 66A

Geographies

  1. Uttar Pradesh
  2. Telangana
  3. Rajasthan
  4. Maharashtra
  5. Assam
  6. Andhra Pradesh
  7. Jharkhand

Time Period: 01/01/2008 - 01/03/2020

Process

  • Identify a state for developing a methodology for case data collection
    • This will help us finalise the processes, which include mining, verification and data validation, and then scale them to other geographies
    • This will also help us come up with better time estimates for the whole data collection exercise
    • We have selected Jharkhand as the pilot state for this purpose
  • Fetch all cases registered under The Information Technology Act, 2000.
    • This will ensure that we don't lose 66A cases that are incorrectly tagged
    • Collect the patterns in which the IT Act is recorded across all the district court establishments of a state
    • Collect only the metadata of the cases available on eCourts (without any PDFs of orders or judgements)
    • IFF raised a concern that we might miss a few 66A cases that are not registered directly under the IT Act but under other acts and sections. IFF will do a preliminary analysis of this before we include other acts when fetching case records
  • Filter out 66A cases from this set of cases
    • This is largely a regular expression matching exercise, where we look for the different ways 66A is mentioned in a case record
  • Set up a data validation and verification pipeline to check whether the cases are correctly tagged as 66A
  • Model data as per the research requirements
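
The 66A filtering step above could be sketched roughly as follows. This is only an illustration: the pattern, the `sections` field name and the sample records are assumptions made for the example, not the actual eCourts format, and the real patterns should come from what we observe in the pilot state.

```python
import re

# Assumed pattern, not taken from eCourts: accepts 66A, 66-A, 66 A and 66(A)
# in any letter case, and rejects 166A, 66AB and 66B.
SECTION_66A = re.compile(r"(?<!\d)66\s*[-(]?\s*A\)?(?![A-Za-z0-9])", re.IGNORECASE)


def mentions_66a(text: str) -> bool:
    """Return True if the text contains a recognisable mention of Section 66A."""
    return SECTION_66A.search(text) is not None


def filter_66a(cases: list[dict]) -> list[dict]:
    """Keep the cases whose (hypothetical) 'sections' field mentions 66A."""
    return [case for case in cases if mentions_66a(case.get("sections", ""))]


# Made-up records for illustration only.
sample_cases = [
    {"case_id": "demo-1", "sections": "66A, 67"},
    {"case_id": "demo-2", "sections": "66-a"},
    {"case_id": "demo-3", "sections": "66 (A)"},
    {"case_id": "demo-4", "sections": "66B, 66C"},
    {"case_id": "demo-5", "sections": "166A"},
]

matched_ids = [case["case_id"] for case in filter_66a(sample_cases)]
print(matched_ids)  # ['demo-1', 'demo-2', 'demo-3']
```

A pattern like this will still produce some false positives and misses, so its output is meant to feed the validation and verification pipeline rather than replace it.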