EDGAR

A small library to access files from the SEC's EDGAR.

Installation

pip install edgar

Example

To get a company's latest 5 10-Ks, run

from edgar import Company
company = Company("Oracle Corp", "0001341439")
tree = company.get_all_filings(filing_type = "10-K")
docs = Company.get_documents(tree, no_of_documents=5)

or, to fetch a 10-K and parse it with TXTML, run

from edgar import Company, TXTML

company = Company("INTERNATIONAL BUSINESS MACHINES CORP", "0000051143")
doc = company.get_10K()
text = TXTML.parse_full_10K(doc)

To get all companies and find a specific one, run

from edgar import Edgar
edgar = Edgar()
possible_companies = edgar.find_company_name("Cisco System")

To get XBRL data, run

from edgar import Company, XBRL, XBRLElement

company = Company("Oracle Corp", "0001341439")
results = company.get_data_files_from_10K("EX-101.INS", isxml=True)
xbrl = XBRL(results[0])
XBRLElement(xbrl.relevant_children_parsed[15]).to_dict()  # returns a dictionary of name, value, and schemaRef

API

Company

The Company class has three fields:

name (company name)
cik (company CIK number)
timeout (optional) (default: 10)

get_filings_url

Returns a url to fetch filings data

Input
- filing_type: The type of document you want, e.g. 10-K, S-8, 8-K. If not specified, all documents are returned.
- prior_to: The time prior to which documents are retrieved. If not specified, all documents are returned.
- ownership: defaults to include. Options are include, exclude, only.
- no_of_entries: The number of entries to return. Defaults to 100, which is also the maximum.
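
As a rough illustration of how these inputs could map onto a URL, here is a minimal sketch. It assumes the URL targets EDGAR's browse-edgar endpoint with type, dateb, owner, and count query parameters; the helper name build_filings_url is hypothetical and the library may build its URL differently.

```python
from urllib.parse import urlencode

# Assumption: EDGAR's company-browse endpoint; the library may use another URL.
EDGAR_BROWSE_URL = "https://www.sec.gov/cgi-bin/browse-edgar"


def build_filings_url(cik, filing_type="", prior_to="", ownership="include", no_of_entries=100):
    """Hypothetical helper mirroring the inputs documented above."""
    if ownership not in ("include", "exclude", "only"):
        raise ValueError("ownership must be include, exclude, or only")
    params = {
        "action": "getcompany",
        "CIK": cik,
        "type": filing_type,               # empty string means all filing types
        "dateb": prior_to,                 # empty string means no date cutoff
        "owner": ownership,
        "count": min(no_of_entries, 100),  # the documented maximum is 100
    }
    return EDGAR_BROWSE_URL + "?" + urlencode(params)


url = build_filings_url("0001341439", filing_type="10-K", no_of_entries=5)
print(url)
```

Clamping no_of_entries in the helper keeps a too-large request from silently asking for more than the documented maximum.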

get_all_filings

Returns the HTML in the form of lxml.html

Input
- filing_type: The type of document you want, e.g. 10-K, S-8, 8-K. If not specified, all documents are returned.
- prior_to: The time prior to which documents are retrieved. If not specified, all documents are returned.
- ownership: defaults to include. Options are include, exclude, only.
- no_of_entries: The number of entries to return. Defaults to 100, which is also the maximum.

get_10Ks

Returns the concatenation of all the documents in the 10-K, as HTML in the form of lxml.html

Input
- no_of_documents (default: 1): number of documents to be retrieved

get_document_type_from_10K

Returns the HTML in the form of lxml.html of the document within the 10-K

Input
- document_type: The type of document you want, e.g. 10-K, EX-3.2
- no_of_documents (default: 1): number of documents to be retrieved

get_data_files_from_10K

Returns the HTML in the form of lxml.html of the data file within the 10-K

Input
- document_type: The type of document you want, e.g. EX-101.INS
- no_of_documents (default: 1): number of documents to be retrieved
- isxml (default: False): by default, the file is parsed with html in lxml, which is not case sensitive. If this is True, it is parsed with etree, which is case sensitive
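
To show why the isxml flag matters for XBRL files, the sketch below contrasts HTML-style and XML-style parsing of the same snippet. It uses the standard library's html.parser and xml.etree.ElementTree as stand-ins for lxml's html and etree parsers, so it illustrates the general behavior only, and the snippet itself is made up.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Made-up snippet with mixed-case tag names, as is common in XBRL.
SNIPPET = "<Doc><NetIncome>42</NetIncome></Doc>"


class TagCollector(HTMLParser):
    """Records start-tag names as the HTML parser reports them."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


collector = TagCollector()
collector.feed(SNIPPET)
collector.close()
html_tags = collector.tags  # HTML parsing lowercases tag names

# XML parsing keeps the original case of every tag.
xml_tags = [el.tag for el in ET.fromstring(SNIPPET).iter()]

print(html_tags)
print(xml_tags)
```

Because XBRL element names are mixed case, looking them up after an HTML-style parse would require lowercased names, which is why a case-sensitive parse is usually the safer choice for data files.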

get_documents (class method)

Returns a list of strings, each containing the body of a document from the input

Input
- tree: the lxml.html tree returned by Company.get_all_filings
- no_of_documents: number of documents to return. If it is 1, the result is a single string instead of a list of strings. Defaults to 1.
- debug (default: False): if True, displays the URL and form
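
Since the return type depends on no_of_documents, calling code may want to normalize the result before looping over it. A minimal sketch, where as_document_list is a hypothetical helper that is not part of the library and the sample strings are placeholders:

```python
def as_document_list(result):
    """Return a list of document strings whether `result` is one string or a list."""
    if isinstance(result, str):
        return [result]  # the no_of_documents=1 case
    return list(result)


single = as_document_list("body of one 10-K")
many = as_document_list(["first 10-K", "second 10-K"])

for body in single + many:
    print(len(body))
```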

Edgar

Gets all companies from EDGAR

get_cik_by_company_name

Input
- name: name of the company

get_company_name_by_cik

Input
- cik: CIK of the company

find_company_name

Input
- words: words to search for in company names
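
The README does not say how the matching works. Purely as an illustration of word-based lookup, the sketch below assumes a case-insensitive match in which every input word must appear in the company name; find_names and the sample list are hypothetical, and the library's actual matching may differ.

```python
def find_names(all_names, words):
    """Hypothetical matcher: keep names containing every word, ignoring case."""
    needles = words.lower().split()
    return [name for name in all_names if all(n in name.lower() for n in needles)]


# Small illustrative sample, not fetched from EDGAR.
names = ["CISCO SYSTEMS, INC.", "SYSCO CORP", "ORACLE CORP"]
matches = find_names(names, "Cisco System")
print(matches)
```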

XBRL

Parses data from XBRL

relevant_children
- get children that are not context
relevant_children_parsed
- get children that are not context, unit, schemaRef
- cleans tags
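
The filtering described above can be sketched with the standard library. This is an assumption-laden illustration, not the library's implementation: it uses xml.etree.ElementTree instead of lxml, treats "cleans tags" as stripping the namespace prefix, and parses a made-up instance document.

```python
import xml.etree.ElementTree as ET

# Illustrative instance document; the ex: taxonomy and values are made up.
INSTANCE = """<xbrl xmlns="http://www.xbrl.org/2003/instance"
      xmlns:link="http://www.xbrl.org/2003/linkbase"
      xmlns:ex="http://example.com/taxonomy">
  <link:schemaRef/>
  <context id="c1"/>
  <unit id="usd"/>
  <ex:Revenues contextRef="c1" unitRef="usd">100</ex:Revenues>
</xbrl>"""


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts on qualified tags."""
    return tag.rsplit("}", 1)[-1]


root = ET.fromstring(INSTANCE)
children = list(root)

# Children that are not context elements.
relevant = [c for c in children if local_name(c.tag) != "context"]

# Children that are not context, unit, or schemaRef, with cleaned tag names.
relevant_parsed = [
    local_name(c.tag)
    for c in children
    if local_name(c.tag) not in {"context", "unit", "schemaRef"}
]

print([local_name(c.tag) for c in relevant])
print(relevant_parsed)
```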

JasonTam/py-edgar