TED (Tenders Electronic Daily) is the online version of the 'Supplement to the Official Journal' of the EU, dedicated to European public procurement.
TED provides free access to public procurement notices from contracting authorities based in the European Union and in the European Economic Area (also if they provide services in any other country).
TED website currently offers bulk downloads of XML packages dating back to 1993, which can be found on the FTP ftp://ted.europa.eu/ accessible with generic credentials (guest/guest).
This repo contains scrips to extract and parse TED Contract award notices, i.e. the results of the procurement procedure. The scripts have been fully tested to extract all Contract award notices for 2014, 2015 and 2016. Documents are extracted as Python dictionaries which can be saved in a MongoDB database for easy retrieval, or further normalised to be converted in CSVs.
The image below show an example of document as shown on the TED website.
An example of extracted data (from same document above):
{
'DOC_ID': '466898-2016',
'CODED_DATA': {
'NOTICE_DATA': {
'IA_URL_GENERAL': 'www.ekz.ch',
'ISO_COUNTRY': 'CH',
'NO_DOC_OJS': '2016/S 253-466898',
'ORIGINAL_CPV': ['79970000'],
'REF_NOTICE': ['2016/S 172-310484']
},
'CODIF_DATA': {
'AA_AUTHORITY_TYPE': '4',
'AC_AWARD_CRIT': 'Z',
'DS_DATE_DISPATCH': '20161230',
'MA_MAIN_ACTIVITIES': ['Z'],
'NC_CONTRACT_NATURE': '4',
'PR_PROC': '2',
'RP_REGULATION': '7',
'TD_DOCUMENT_TYPE': '7',
'TY_TYPE_BID': '9'
}
},
'CONTRACT': {
'OTH_NOT': 'NO',
'CONTRACTING_AUTHORITY': 'Elektrizitätswerke des Kantons Zürich',
'CONTRACT_OBJECT': {
'CONCLUSION_FRAMEWORK_AGREEMENT': 'NO',
'CONTRACTS_DPS': 'NO',
'CONTRACT_COVERED_GPA': 'YES',
'CPV_MAIN': '79970000'
}
},
'AWARDS_OF_CONTRACT': [
{
'CONTRACTOR': {
'ADDRESS': 'Mühlebachstraße 52',
'COUNTRY': 'CH',
'OFFICIALNAME': 'Linkgroup AG',
'POSTAL_CODE': '8008',
'TOWN': 'Zürich'
},
'CONTRACT_VALUE': {
'COST': {
'CURRENCY': 'CHF',
'VALUE': 700000.0
}
}
}
]
}
Fields and description of the extracted data
Field | Data type | Description |
---|---|---|
DOC_ID | String | Unique document number in TED. |
The CODED_DATA section is divided in 2 groups of data.
- NOTICE_DATA contains data related to the notice
- CODIF_DATA contains the META data (codification) related to the notice.
Field | Data type | Description |
---|---|---|
NO_DOC_OJS | String | Notice number in TED |
ORIGINAL_NUTS | List(String) | Region code(s) of the place of performance or delivery. A 2-5 digits code of the Nomenclature of Territorial Units for Statistics. Lookup values for first two digits: ISO_COUNTRY.csv |
ORIGINAL_CPV | List(String) | Product or service 8 digits code(s) of the Common Procurement Vocabulary. Lookup values for first two digits: CPV.csv |
ISO_COUNTRY | String | 2-characters ISO code of the country where the contracting body is located. Lookup values: ISO_COUNTRY.csv |
IA_URL_GENERAL | String | Main internet address (URL) of the contracting body |
REF_NOTICE | List(String) | Reference notice number in TED. Referencing a previous publication (prior information, corrigendum, ...) |
VALUES_LIST | Total Value | Estimated total value(s) or total final value of the procurement |
VALUES_LIST can be composed of the following fields:
Field | Data type | Description |
---|---|---|
GLOBAL_VALUE | Value | The total value of the tender. Should equal the sum of in CONTRACTS_VALUE List |
CONTRACTS_VALUE | List(Value) | Individual awards value. This happens when the contract is composed of multiple parts or lots. |
Each value can be single or range value and can have VAT percentage
- When is single:
Field | Data type | Description |
---|---|---|
CURRENCY | String | Currency of the value |
VALUE | Float | Value |
VAT_PRCT | Int | Vat percentage |
- When is range:
Field | Data type | Description |
---|---|---|
CURRENCY | String | Currency of the value |
LOW_VALUE | Float | Lower value of the range |
HIGH_VALUE | Float | Higher value of the range |
VAT_PRCT | Int | Vat percentage |
Field | Data type | Description |
---|---|---|
DS_DATE_DISPATCH | String | Date of dispatch of the notice. Format: yyyymmdd |
TD_DOCUMENT_TYPE | String | Type of document. Lookup values: TD_DOCUMENT_TYPE.csv |
AA_AUTHORITY_TYPE | String | Type of awarding authority. Lookup values: AA_AUTHORITY_TYPE.csv |
NC_CONTRACT_NATURE | String | Nature of contract. Lookup values: NC_CONTRACT_NATURE.csv |
PR_PROC | String | Type of procedure. Lookup values: PR_PROC.csv |
RP_REGULATION | String | The regulation that applies to the procedure. Lookup values: RP_REGULATION.csv |
TY_TYPE_BID | String | Type of bid. Lookup values: TY_TYPE_BID.csv |
AC_AWARD_CRIT | String | Type of awarding criteria. Lookup values: AC_AWARD_CRIT.csv |
MA_MAIN_ACTIVITIES | List(String) | Main activity of the contracting body. Lookup values: MA_MAIN_ACTIVITIES.csv |
The CONTRACT section contains the notice itself, in the XML format. In the original XML file this section can be available in different translations, up to 24 when is fully translated. If available, the English translation of the contract is preferred. Otherwise, French and German becomes second and third choice respectively. When any of this translation is not present, the first available is picked.
Field | Data type | Description |
---|---|---|
OTH_NOT | String | Indicate whether the notice follow a standard structure (then OTH_NOT = NO) or structure is open to allow the publication of any other notice which is not following a standard form, aka non-structured notice (then OTH_NOT = YES). Only when OTH_NOT = NO, the fields CONTRACTING_AUTHORITY, CONTRACT_OBJECT and AWARDS_OF_CONTRACT are extracted |
Field | Data type | Description |
---|---|---|
CONTRACTING_AUTHORITY | String | Name of Contracting Authority |
Object of the contract.
Field | Data type | Description |
---|---|---|
NUTS | List(String) | Region code(s) of the place of performance or delivery. A 2-5 digits code of the Nomenclature of Territorial Units for Statistics. Lookup values for first two digits: ISO_COUNTRY.csv |
NUTS_EXTRA | String | Additional extracted comments on the place of performance or delivery |
CPV_MAIN | String | Main Product or service 8 digits code of the Common Procurement Vocabulary. Lookup values for first two digits: CPV.csv |
CONTRACT_COVERED_GPA | String | YES or NO if contract is covered by GPA (Government Procurement Agreement) |
CONCLUSION_FRAMEWORK_AGREEMENT | String | YES or NO if contract is part of a Framework Agreement |
CONTRACTS_DPS | String | YES or NO if contract is subject to a Dynamic Purchasing System |
CONTRACT_VALUE | Contract Value | Total value. Should correspond to the sum of individual contractor's contract values. See Section AWARDS_OF_CONTRACT |
Each contract value can be composed of the following fields:
Field | Data type | Description |
---|---|---|
COST | Value | The actual cost |
ESTIMATE | Value | An initial estimate of the cost |
NUMBER_OF_YEARS | Int | Number of year the contract last. Cost/Estimate might be given per year |
NUMBER_OF_MONTHS | Int | Number of month the contract last. Cost/Estimate might be given per month |
Fields are same as mentioned above. See Section Value Data Type
List of awards.
Each award can contain the following sub-sections:
- CONTRACTOR: General Information about successful bidder
- CONTRACT_VALUE: Contract value associated with contractor.
Field | Data type | Description |
---|---|---|
OFFICIALNAME | String | Name of Contractor |
COUNTRY | String | Country of Contractor |
ADDRESS | String | Address (Street Name) of Contractor |
TOWN | String | Town of Contractor |
POSTAL_CODE | String | Postal Code of Contractor |
Fields are same as mentioned above. See Section 3.3.1 CONTRACT_VALUE
Python modules required:
- lxml
- voluptuous
- pandas
Optional:
- pymongo
Python version: 3.6 Database: MongoDB, local mode
Example of how to convert a document
file_path = '454322_2015.xml' # Contract Award notice downloaded from tED website
# Extract the raw data
from extractor import extract
raw = extract(file_path)
# Validate the raw data
from validator import validate
data = validate(raw)
# Prune the data: remove empty fields
from validator import prune
prune(data)
Repository structure:
.
├── Lookups
│ └── AA_AUTHORITY_TYPE.csv
│ └── AC_AWARD_CRIT.csv
│ └── CPV.csv
│ └── ISO_COUNTRY.csv
│ └── MA_MAIN_ACTIVITY.csv
│ └── NC_CONTRACT_NATURE.csv
│ └── PR_PROC.csv
│ └── RP_REGULATION.csv
│ └── TD_DOCUMENT_TYPE.csv
│ └── TY_TYPE_BID.csv
├── extractor.py
└── validator.py
└── mongo_import.py
Lookups
: folder containing various lookup filesextractor.py
: script to extract raw data from the contract award noticesvalidator.py
: scripts to validate the raw data and prune the dictionary (i.e. remove the empty fileds)mongo_import.py
: script to upload the data in a MongoDB database
Alessandra Sozzi, working for the Office for National Statistics Big Data project
Released under the MIT License.