get_earnings_for_date() Always Fails Now
WesNeu opened this issue · 6 comments
My daily script suddenly started failing yesterday in here:
def get_earnings_for_date(date, offset = 0, count = 1):
    . . .
    stores = result['context']['dispatcher']['stores']
It appears stores is expected to be a dict, but it's now coming back as a string.
That causes the next line, which tries to use stores as a dict, to fail:
earnings_count = stores['ScreenerCriteriaStore']['meta']['total']
so I'm guessing Yahoo has changed the underlying data format . . .
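One way to confirm this diagnosis is to check the type of stores before indexing into it. The sketch below is only an illustration: extract_earnings_count is a hypothetical helper name, not part of yahoo_fin, and it assumes result is the already-parsed JSON object used in get_earnings_for_date().

```python
def extract_earnings_count(result):
    """Return the total earnings count, or raise a clear error if the
    stores payload arrived as a string instead of a dict."""
    stores = result['context']['dispatcher']['stores']
    if isinstance(stores, str):
        # Hypothetical guard: a string here means the payload is no longer
        # plain JSON, so indexing it like a dict would fail.
        raise TypeError("stores is a str, not a dict; Yahoo's data format has changed")
    return stores['ScreenerCriteriaStore']['meta']['total']
```

With a guard like this the script fails with a message that names the cause, rather than an indexing error on the following line.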
Yes, it's the same root cause as this: #94
Yahoo made a change and is returning an encrypted string for the "stores" value now instead of a plain text dictionary.
Here's the workaround I'm using successfully, which is based on the fix yfinance used for the same issue. (ranaroussi/yfinance#1253)
1. Edit the stock_info.py file from yahoo_fin (...\Lib\site-packages\yahoo_fin\stock_info.py)
2. Add the following imports (install the cryptography package in your project with pip if necessary; hashlib and base64 are part of the standard library)
import hashlib
from base64 import b64decode
from cryptography.hazmat.primitives import padding
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
3. Edit the get_earnings_for_date() function
change this line: stores = result['context']['dispatcher']['stores']
to this: stores = decrypt_cryptojs_aes(result)
4. Add the following function:
def decrypt_cryptojs_aes(data):
    encrypted_stores = data['context']['dispatcher']['stores']
    _cs = data["_cs"]
    _cr = data["_cr"]

    _cr = b"".join(int.to_bytes(i, length=4, byteorder="big", signed=True) for i in json.loads(_cr)["words"])
    password = hashlib.pbkdf2_hmac("sha1", _cs.encode("utf8"), _cr, 1, dklen=32).hex()

    encrypted_stores = b64decode(encrypted_stores)
    assert encrypted_stores[0:8] == b"Salted__"
    salt = encrypted_stores[8:16]
    encrypted_stores = encrypted_stores[16:]

    def EVPKDF(password, salt, keySize=32, ivSize=16, iterations=1, hashAlgorithm="md5") -> tuple:
        """OpenSSL EVP Key Derivation Function
        Args:
            password (Union[str, bytes, bytearray]): Password to generate key from.
            salt (Union[bytes, bytearray]): Salt to use.
            keySize (int, optional): Output key length in bytes. Defaults to 32.
            ivSize (int, optional): Output Initialization Vector (IV) length in bytes. Defaults to 16.
            iterations (int, optional): Number of iterations to perform. Defaults to 1.
            hashAlgorithm (str, optional): Hash algorithm to use for the KDF. Defaults to 'md5'.
        Returns:
            key, iv: Derived key and Initialization Vector (IV) bytes.
        Taken from: https://gist.github.com/rafiibrahim8/0cd0f8c46896cafef6486cb1a50a16d3
        OpenSSL original code: https://github.com/openssl/openssl/blob/master/crypto/evp/evp_key.c#L78
        """
        assert iterations > 0, "Iterations can not be less than 1."

        if isinstance(password, str):
            password = password.encode("utf-8")

        final_length = keySize + ivSize
        key_iv = b""
        block = None

        while len(key_iv) < final_length:
            hasher = hashlib.new(hashAlgorithm)
            if block:
                hasher.update(block)
            hasher.update(password)
            hasher.update(salt)
            block = hasher.digest()
            for _ in range(1, iterations):
                block = hashlib.new(hashAlgorithm, block).digest()
            key_iv += block

        key, iv = key_iv[:keySize], key_iv[keySize:final_length]
        return key, iv

    key, iv = EVPKDF(password, salt, keySize=32, ivSize=16, iterations=1, hashAlgorithm="md5")

    cipher = Cipher(algorithms.AES(key), modes.CBC(iv))
    decryptor = cipher.decryptor()
    plaintext = decryptor.update(encrypted_stores) + decryptor.finalize()
    unpadder = padding.PKCS7(128).unpadder()
    plaintext = unpadder.update(plaintext) + unpadder.finalize()
    plaintext = plaintext.decode("utf-8")

    decoded_stores = json.loads(plaintext)
    return decoded_stores
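The first half of the function above (turning _cr into bytes, deriving the password, and splitting the salt off the "Salted__" blob) needs only the standard library, so it can be tried in isolation. This is a rough sketch with made-up inputs. The helper names words_to_bytes, derive_password and split_salted are hypothetical and do not exist in yahoo_fin or yfinance.

```python
import hashlib
import json
from base64 import b64decode, b64encode


def words_to_bytes(word_array_json):
    """Convert a JSON string holding a "words" list of signed 32-bit ints
    into raw bytes, 4 big-endian bytes per word."""
    words = json.loads(word_array_json)["words"]
    return b"".join(w.to_bytes(4, byteorder="big", signed=True) for w in words)


def derive_password(cs, cr_json):
    """Single-iteration PBKDF2-HMAC-SHA1 over _cs with _cr as the salt,
    hex-encoded, mirroring the password step in the workaround."""
    salt = words_to_bytes(cr_json)
    return hashlib.pbkdf2_hmac("sha1", cs.encode("utf8"), salt, 1, dklen=32).hex()


def split_salted(blob_b64):
    """Split a base64 "Salted__" blob into (salt, ciphertext)."""
    raw = b64decode(blob_b64)
    if raw[:8] != b"Salted__":
        raise ValueError("missing Salted__ header")
    return raw[8:16], raw[16:]


# Made-up inputs, purely for illustration.
print(words_to_bytes('{"words": [1, -1]}'))  # b'\x00\x00\x00\x01\xff\xff\xff\xff'
salt, ciphertext = split_salted(b64encode(b"Salted__" + b"12345678" + b"payload"))
print(salt, ciphertext)  # b'12345678' b'payload'
```

The signed=True argument matters because the words can be negative, as the -1 in the example shows.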
Thank you for the workaround solution.
It worked great until this past Friday.
Since then I get
KeyError: '_cs'
Any idea how to fix that?
Yep, Yahoo made another change to the encryption. Best to find an alternate project if you can (yfinance etc.). I need the get_earnings_for_date() function which is not available on yfinance so I'm stuck for now. Fortunately, the yfinance project seems to be keeping up with the encryption changes, so you can always go there to get the latest workaround: https://github.com/ranaroussi/yfinance/blob/main/yfinance/data.py
Below is the updated decryption function I'm using. Note that they changed the function name from "decrypt_cryptojs_aes" to "decrypt_cryptojs_aes_stores", so you'll need to update your call. You'll also need to change your imports section.
# imports
import requests
import pandas as pd
import ftplib
import io
import re
import json
import datetime
import hashlib
from base64 import b64decode
usePycryptodome = False  # slightly faster
# usePycryptodome = True
if usePycryptodome:
    from Crypto.Cipher import AES
    from Crypto.Util.Padding import unpad
else:
    from cryptography.hazmat.primitives import padding
    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

try:
    from requests_html import HTMLSession
except Exception:
    print("""Warning - Certain functionality
             requires requests_html, which is not installed.
             Install using:
             pip install requests_html
             After installation, you may have to restart your Python session.""")
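Instead of setting the usePycryptodome flag by hand, the backend could be chosen based on which package is importable. This is a sketch of one possible approach, not something yahoo_fin or yfinance does. The name pick_aes_backend is hypothetical, and finding a top-level Crypto package does not by itself prove it is pycryptodome rather than the older PyCrypto.

```python
from importlib.util import find_spec


def pick_aes_backend():
    """Return "pycryptodome", "cryptography", or None, depending on which
    top-level package can be found without importing it."""
    if find_spec("Crypto") is not None:
        return "pycryptodome"
    if find_spec("cryptography") is not None:
        return "cryptography"
    return None


# The flag used by the imports section could then be derived like this.
usePycryptodome = pick_aes_backend() == "pycryptodome"
```

If neither package is installed, pick_aes_backend() returns None, and the else branch of the imports section would still fail on the cryptography import.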
# updated decryption function
def decrypt_cryptojs_aes_stores(data):
    encrypted_stores = data['context']['dispatcher']['stores']

    if "_cs" in data and "_cr" in data:
        _cs = data["_cs"]
        _cr = data["_cr"]
        _cr = b"".join(int.to_bytes(i, length=4, byteorder="big", signed=True) for i in json.loads(_cr)["words"])
        password = hashlib.pbkdf2_hmac("sha1", _cs.encode("utf8"), _cr, 1, dklen=32).hex()
    else:
        # Currently assume one extra key in dict, which is password. Print error if
        # more extra keys detected.
        new_keys = [k for k in data.keys() if k not in ["context", "plugins"]]
        l = len(new_keys)
        if l == 0:
            return None
        elif l == 1 and isinstance(data[new_keys[0]], str):
            password_key = new_keys[0]
        else:
            msg = "Yahoo has again changed data format, yfinance now unsure which key(s) is for decryption:"
            k = new_keys[0]
            k_str = k if len(k) < 32 else k[:32-3]+"..."
            msg += f" '{k_str}'->{type(data[k])}"
            for i in range(1, len(new_keys)):
                # Report each remaining key, not the first one repeatedly.
                k = new_keys[i]
                k_str = k if len(k) < 32 else k[:32-3]+"..."
                msg += f" , '{k_str}'->{type(data[k])}"
            raise Exception(msg)
        password_key = new_keys[0]
        password = data[password_key]

    encrypted_stores = b64decode(encrypted_stores)
    assert encrypted_stores[0:8] == b"Salted__"
    salt = encrypted_stores[8:16]
    encrypted_stores = encrypted_stores[16:]

    def EVPKDF(password, salt, keySize=32, ivSize=16, iterations=1, hashAlgorithm="md5") -> tuple:
        """OpenSSL EVP Key Derivation Function
        Args:
            password (Union[str, bytes, bytearray]): Password to generate key from.
            salt (Union[bytes, bytearray]): Salt to use.
            keySize (int, optional): Output key length in bytes. Defaults to 32.
            ivSize (int, optional): Output Initialization Vector (IV) length in bytes. Defaults to 16.
            iterations (int, optional): Number of iterations to perform. Defaults to 1.
            hashAlgorithm (str, optional): Hash algorithm to use for the KDF. Defaults to 'md5'.
        Returns:
            key, iv: Derived key and Initialization Vector (IV) bytes.
        Taken from: https://gist.github.com/rafiibrahim8/0cd0f8c46896cafef6486cb1a50a16d3
        OpenSSL original code: https://github.com/openssl/openssl/blob/master/crypto/evp/evp_key.c#L78
        """
        assert iterations > 0, "Iterations can not be less than 1."

        if isinstance(password, str):
            password = password.encode("utf-8")

        final_length = keySize + ivSize
        key_iv = b""
        block = None

        while len(key_iv) < final_length:
            hasher = hashlib.new(hashAlgorithm)
            if block:
                hasher.update(block)
            hasher.update(password)
            hasher.update(salt)
            block = hasher.digest()
            for _ in range(1, iterations):
                block = hashlib.new(hashAlgorithm, block).digest()
            key_iv += block

        key, iv = key_iv[:keySize], key_iv[keySize:final_length]
        return key, iv

    try:
        key, iv = EVPKDF(password, salt, keySize=32, ivSize=16, iterations=1, hashAlgorithm="md5")
    except Exception as exc:
        raise Exception("yfinance failed to decrypt Yahoo data response") from exc

    if usePycryptodome:
        cipher = AES.new(key, AES.MODE_CBC, iv=iv)
        plaintext = cipher.decrypt(encrypted_stores)
        plaintext = unpad(plaintext, 16, style="pkcs7")
    else:
        cipher = Cipher(algorithms.AES(key), modes.CBC(iv))
        decryptor = cipher.decryptor()
        plaintext = decryptor.update(encrypted_stores) + decryptor.finalize()
        unpadder = padding.PKCS7(128).unpadder()
        plaintext = unpadder.update(plaintext) + unpadder.finalize()
        plaintext = plaintext.decode("utf-8")

    decoded_stores = json.loads(plaintext)
    return decoded_stores
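The fallback branch in the function above guesses which top-level key holds the password when "_cs" and "_cr" are absent. That guess can be pulled out and exercised on its own. The following is a simplified sketch of the same idea, not the yfinance implementation: find_password_key is a hypothetical name, and the set of known keys ("context", "plugins") is taken from the function above.

```python
def find_password_key(data, known_keys=("context", "plugins")):
    """Return the single extra top-level key whose value is a string,
    None if there are no extra keys, and raise if the choice is ambiguous."""
    extra = [k for k in data if k not in known_keys]
    if not extra:
        return None
    if len(extra) == 1 and isinstance(data[extra[0]], str):
        return extra[0]
    # Describe every candidate key, truncating long names for readability.
    parts = []
    for k in extra:
        short = k if len(k) < 32 else k[:29] + "..."
        parts.append(f"'{short}' -> {type(data[k]).__name__}")
    raise ValueError("cannot tell which key holds the password: " + ", ".join(parts))


# Made-up payloads, purely for illustration.
print(find_password_key({"context": {}, "plugins": {}}))    # None
print(find_password_key({"context": {}, "abc123": "pw"}))   # abc123
```

Because the no-extra-keys case returns None, just as decrypt_cryptojs_aes_stores() does, a caller should check for None before indexing into the result.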
When I looked at yfinance, it seemed that the Earnings data was on a per symbol basis, is that correct?
Just as a side note, I see that alphavantage.co API has an earnings API call, which works and is free, however I'm still keen to continue using yahoo_fin or yfinance if possible, as I don't know where the Alphavantage data is sourced, or how reliable it is. I use it to get earnings data, such as EPS and that has been quite good so far.
Thanks for updating us here, I wonder if it's worth submitting a Pull Request?
"When I looked at yfinance, it seemed that the Earnings data was on a per symbol basis, is that correct?"
Yes, I think so.
Thanks for the tip about the alphavantage.co API -- I will eventually find time to investigate it.
@sonso-1
Thanks again for the solution that worked for a few more weeks.
"I need the get_earnings_for_date() function which is not available on yfinance so I'm stuck for now."
Yes, I'm in the same boat.
"you can always go there to get the latest workaround: https://github.com/ranaroussi/yfinance/blob/main/yfinance/data.py"
I just tried and it doesn't seem to work. Did you get it working?
If so, are you willing to share a link to your updated version of stock_info.py?
Thanks!
It's not working for me either. I'm getting "Exception: yfinance failed to decrypt Yahoo data response"
I think they will fix this eventually over at yfinance. If you need earnings by date specifically, there is a pull request over there for this (ranaroussi/yfinance#1316), although I'm not sure it has been merged yet.
Also might want to take a look here: https://github.com/ranaroussi/yfinance/compare/1019efda61ad87b8183c2e26bd80f85035e0010f..0c037ddd128f3ce5dee79ceb8a8571e5000fcd30
I copied/pasted the "get_earnings_by_date" and "_get_earnings_by_date" functions into my project and modified them to fit my needs. Not pretty, but it gets the job done for now.