ropensci/software-review

yfR: Downloads and Organizes Financial Data from Yahoo Finance

Closed this issue · 79 comments

Date accepted: 2022-06-21

Submitting Author Name: Marcelo Perlin
Submitting Author Github Handle: @msperlin
Other Package Authors Github handles: (comma separated, delete if none)
Repository: https://github.com/msperlin/yfR
Version submitted: 0.0.1
Submission type: Standard
Editor: @melvidoni
Reviewers: @s3alfisc, @thisisnic

Due date for @s3alfisc: 2022-05-29

Due date for @thisisnic: 2022-06-13

Archive: TBD
Version accepted: TBD
Language: en

  • Paste the full DESCRIPTION file inside a code block below:
Package: yfR
Title: Downloads and Organizes Financial Data from Yahoo Finance
Version: 0.0.1
Authors@R: person("Marcelo", "Perlin", email = "marceloperlin@gmail.com", role = c("aut", "cre"))
Description: Facilitates download of financial data from Yahoo Finance <https://finance.yahoo.com/>, 
 a vast repository of stock price data across multiple financial exchanges. The package offers a local caching system
 and support for parallel computation.
URL: https://github.com/msperlin/yfR
BugReports: https://github.com/msperlin/yfR/issues
Depends:
    R (>= 4.1)
Imports: stringr, curl, tidyr, 
    lubridate, furrr, purrr, future, tibble, zoo,
    cli, readr, rvest, dplyr, quantmod
License: MIT + file LICENSE
LazyData: true
RoxygenNote: 7.1.2
Suggests: 
    knitr,
    rmarkdown,
    testthat (>= 3.0.0),
    ggplot2,
    covr
VignetteBuilder: knitr
Config/testthat/edition: 3

Scope

  • Please indicate which category or categories from our package fit policies this package falls under: (Please check an appropriate box below. If you are unsure, we suggest you make a pre-submission inquiry.):

    • data retrieval
    • data extraction
    • data munging
    • data deposition
    • workflow automation
    • version control
    • citation management and bibliometrics
    • scientific software wrappers
    • field and lab reproducibility tools
    • database software bindings
    • geospatial data
    • text analysis
  • Explain how and why the package falls under these categories (briefly, 1-2 sentences):

Package yfR retrieves and organizes data from Yahoo Finance, a large repository for stock price data.

  • Who is the target audience and what are scientific applications of this package?

The target audience is students, researchers, and industry practitioners in the fields of Finance and Economics.

Package yfR is the second and backwards-incompatible version of BatchGetSymbols, also developed by me. My plan is to first deprecate BatchGetSymbols and later remove it from CRAN and archive it on GitHub.

Moreover, there are other packages, such as quantmod, that download data from Yahoo Finance, but none with features similar to those of yfR and BatchGetSymbols.

Yes.

  • If you made a pre-submission inquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted.

  • Explain reasons for any pkgcheck items which your package is unable to pass.

Unfortunately, I was not able to run pkgcheck locally, as I could not install (or build) the ctags dependency on my Linux Mint 20.3 machine. Nonetheless, I read through and followed all guidelines available in the manual.

Technical checks

Confirm each of the following by checking the box.

This package:

Publication options

  • Do you intend for this package to go on CRAN?

  • Do you intend for this package to go on Bioconductor?

  • Do you wish to submit an Applications Article about your package to Methods in Ecology and Evolution? If so:

MEE Options
  • The package is novel and will be of interest to the broad readership of the journal.
  • The manuscript describing the package is no longer than 3000 words.
  • You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see MEE's Policy on Publishing Code)
  • (Scope: Do consider MEE's Aims and Scope for your manuscript. We make no guarantee that your manuscript will be within MEE scope.)
  • (Although not required, we strongly recommend having a full manuscript prepared when you submit here.)
  • (Please do not submit your package separately to Methods in Ecology and Evolution)

Code of conduct

Thanks for submitting to rOpenSci, our editors and @ropensci-review-bot will reply soon. Type @ropensci-review-bot help for help.

🚀

The following problem was found in your submission template:

  • 'author1' variable must be GitHub handle only ('@myhandle')
    Editors: Please ensure these problems with the submission template are rectified. Package checks have been started regardless.

👋

Oops, something went wrong with our automatic package checks. Our developers have been notified and package checks will appear here as soon as we've resolved the issue. Sorry for any inconvenience.

Checks for yfR (v0.0.1)

git hash: c345549c

  • ✔️ Package name is available
  • ✔️ has a 'codemeta.json' file.
  • ✔️ has a 'contributing' file.
  • ✔️ uses 'roxygen2'.
  • ✔️ 'DESCRIPTION' has a URL field.
  • ✔️ 'DESCRIPTION' has a BugReports field.
  • ✔️ Package has at least one HTML vignette
  • ✔️ All functions have examples.
  • ✖️ Package has no continuous integration checks.
  • ✔️ Package coverage is 87.8%.
  • ✔️ R CMD check found no errors.
  • ✔️ R CMD check found no warnings.

Important: All failing checks above must be addressed prior to proceeding

Package License: MIT + file LICENSE


1. Statistical Properties

This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.

Details of statistical properties (click to open)

The package has:

  • code in R (100% in 8 files) and
  • 1 authors
  • 1 vignette
  • no internal data file
  • 14 imported packages
  • 6 exported functions (median 16 lines of code)
  • 34 non-exported functions in R (median 12 lines of code)

Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages
The following terminology is used:

  • loc = "Lines of Code"
  • fn = "function"
  • exp/not_exp = exported / not exported

All parameters are explained as tooltips in the locally-rendered HTML version of this report generated by the checks_to_markdown() function

The final measure (fn_call_network_size) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile.

measure value percentile noteworthy
files_R 8 50.7
files_vignettes 3 92.4
files_tests 5 81.7
loc_R 779 61.1
loc_vignettes 160 41.2
loc_tests 184 53.0
num_vignettes 1 64.8
n_fns_r 40 49.3
n_fns_r_exported 6 29.1
n_fns_r_not_exported 34 56.6
n_fns_per_file_r 3 45.9
num_params_per_fn 2 11.9
loc_per_fn_r 14 45.4
loc_per_fn_r_exp 16 38.0
loc_per_fn_r_not_exp 12 42.0
rel_whitespace_R 29 73.7
rel_whitespace_vignettes 65 65.5
rel_whitespace_tests 56 72.7
doclines_per_fn_exp 20 13.8
doclines_per_fn_not_exp 0 0.0 TRUE
fn_call_network_size 37 59.9

1a. Network visualisation

Click to see the interactive network visualisation of calls between objects in package


2. goodpractice and other checks

Details of goodpractice and other checks (click to open)


3b. goodpractice results

R CMD check with rcmdcheck

rcmdcheck found no errors, warnings, or notes

Test coverage with covr

Package coverage: 87.78

Cyclocomplexity with cyclocomp

The following functions have cyclocomplexity >= 15:

function cyclocomplexity
yf_get 23
yf_get_single_ticker 22

Static code analyses with lintr

lintr found the following 2 potential issues:

message number of times
Avoid library() and require() calls in packages 2


Package Versions

package version
pkgstats 0.0.3.96
pkgcheck 0.0.2.276


Editor-in-Chief Instructions:

Processing may not proceed until the items marked with ✖️ have been resolved.

@jooolia The failing check is just because the README does not have a CI badge. @msperlin Could you please add an R CMD check badge to your readme? (We check for CI via badges rather than workflow results, because we do accept submissions from arbitrary code-hosting platforms, not just GitHub.) Thanks!

Good morning.

Sure, I just added the R-CMD badge.

Thanks, about to send the query.

🚀

Editor check started

👋

Oops, something went wrong with our automatic package checks. Our developers have been notified and package checks will appear here as soon as we've resolved the issue. Sorry for any inconvenience.

Checks for yfR (v0.0.1)

git hash: 1ee2f6f5

  • ✔️ Package name is available
  • ✔️ has a 'codemeta.json' file.
  • ✔️ has a 'contributing' file.
  • ✔️ uses 'roxygen2'.
  • ✔️ 'DESCRIPTION' has a URL field.
  • ✔️ 'DESCRIPTION' has a BugReports field.
  • ✔️ Package has at least one HTML vignette
  • ✔️ All functions have examples.
  • ✔️ Package has continuous integration checks.
  • ✔️ Package coverage is 87.8%.
  • ✔️ R CMD check found no errors.
  • ✔️ R CMD check found no warnings.

Package License: MIT + file LICENSE


1. Package Dependencies

Details of Package Dependency Usage (click to open)

The table below tallies all function calls to all packages ('ncalls'), both internal (r-base + recommended, along with the package itself), and external (imported and suggested packages). 'NA' values indicate packages to which no identified calls to R functions could be found. Note that these results are generated by an automated code-tagging system which may not be entirely accurate.

type package ncalls
internal base 69
internal yfR 17
internal utils 3
imports dplyr 11
imports purrr 5
imports readr 5
imports stringr 4
imports rvest 3
imports tidyr 2
imports lubridate 2
imports furrr 2
imports future 2
imports tibble 1
imports zoo 1
imports quantmod 1
imports curl NA
imports cli NA
suggests knitr NA
suggests rmarkdown NA
suggests testthat NA
suggests ggplot2 NA
suggests covr NA
linking_to NA NA

Click below for tallies of functions used in each package. Locations of each call within this package may be generated locally by running 's <- pkgstats::pkgstats(<path/to/repo>)', and examining the 'external_calls' table.

base

c (6), file.path (5), as.Date (4), min (4), paste0 (4), data.frame (3), file.exists (3), length (3), list (3), seq (3), as.character (2), as.numeric (2), for (2), max (2), names (2), options (2), rep (2), switch (2), tempdir (2), as.POSIXct (1), class (1), file (1), is.na (1), lapply (1), list.files (1), order (1), seq_along (1), setdiff (1), sum (1), Sys.Date (1), Sys.getenv (1), which (1)

yfR

fix_ticker_name (2), get_morale_boost (2), set_cli_msg (2), yf_get_available_indices (2), calc_ret (1), date_to_unix (1), fct_format_wide (1), unix_to_date (1), yf_get (1), yf_get_available_collections (1), yf_get_ibov_stocks (1), yf_get_index_comp (1), yf_get_single_ticker (1)

dplyr

first (3), bind_rows (2), tibble (2), filter (1), lag (1), mutate (1), rename (1)

purrr

map (2), map_chr (2), pmap (1)

readr

read_rds (4), write_rds (1)

stringr

fixed (1), str_c (1), str_detect (1), str_split (1)

rvest

html_nodes (2), html_table (1)

utils

data (2), capture.output (1)

furrr

furrr_options (2)

future

availableCores (1), plan (1)

lubridate

wday (2)

tidyr

all_of (1), pivot_wider (1)

quantmod

getSymbols (1)

tibble

tibble (1)

zoo

index (1)


2. Statistical Properties

This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.

Details of statistical properties (click to open)

The package has:

  • code in R (100% in 8 files) and
  • 1 authors
  • 1 vignette
  • no internal data file
  • 14 imported packages
  • 6 exported functions (median 16 lines of code)
  • 34 non-exported functions in R (median 12 lines of code)

Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages
The following terminology is used:

  • loc = "Lines of Code"
  • fn = "function"
  • exp/not_exp = exported / not exported

All parameters are explained as tooltips in the locally-rendered HTML version of this report generated by the checks_to_markdown() function

The final measure (fn_call_network_size) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile.

measure value percentile noteworthy
files_R 8 50.7
files_vignettes 3 92.4
files_tests 5 81.7
loc_R 779 61.1
loc_vignettes 160 41.2
loc_tests 184 53.0
num_vignettes 1 64.8
n_fns_r 40 49.3
n_fns_r_exported 6 29.1
n_fns_r_not_exported 34 56.6
n_fns_per_file_r 3 45.9
num_params_per_fn 2 11.9
loc_per_fn_r 14 45.4
loc_per_fn_r_exp 16 38.0
loc_per_fn_r_not_exp 12 42.0
rel_whitespace_R 29 73.7
rel_whitespace_vignettes 65 65.5
rel_whitespace_tests 56 72.7
doclines_per_fn_exp 20 13.8
doclines_per_fn_not_exp 0 0.0 TRUE
fn_call_network_size 37 59.9

2a. Network visualisation

Click to see the interactive network visualisation of calls between objects in package


3. goodpractice and other checks

Details of goodpractice and other checks (click to open)

3a. Continuous Integration Badges

R-CMD-check

GitHub Workflow Results

name conclusion sha date
pages build and deployment success 1ee2f6 2022-03-31
pkgdown success 51af0f 2022-03-30
R-CMD-check success 1ee2f6 2022-03-31
render-rmarkdown failure f3dbe5 2022-03-30
test-coverage success 1ee2f6 2022-03-31

3b. goodpractice results

R CMD check with rcmdcheck

rcmdcheck found no errors, warnings, or notes

Test coverage with covr

Package coverage: 87.78

Cyclocomplexity with cyclocomp

The following functions have cyclocomplexity >= 15:

function cyclocomplexity
yf_get 23
yf_get_single_ticker 22

Static code analyses with lintr

lintr found the following 2 potential issues:

message number of times
Avoid library() and require() calls in packages 2


Package Versions

package version
pkgstats 0.0.4.4
pkgcheck 0.0.3.6


Editor-in-Chief Instructions:

This package is in top shape and may be passed on to a handling editor

Dear @msperlin,
Thank you for your submission. The package has passed all of the automated package checks and the test coverage is good.
Could you expand a bit more on how this package differs from quantmod and tidyquant?
Thanks, Julia

Good morning Julia,

The main goal of yfR is to help users download large amounts of data from Yahoo Finance (YF).

Packages quantmod and tidyquant also offer a function for downloading price data from YF, but only that. Besides importing data, yfR offers the following functionalities (see the sketch after this list):

  • Organization and clean up of data

    • Users can set a threshold for what is "bad" data with respect to matching dates to a benchmark dataset (SP500 is usually used);
    • Users can also ask for "complete data", where all missing dates are set as NA for later substitution;
    • Log or arithmetic returns, widely used in research, are also calculated by default;
    • Users can aggregate the data to weekly, monthly, or yearly frequencies, always keeping the same data structure.
  • Smarter downloads

    • A local (and smart) session-persistent caching system is implemented. This means that, within a session, the data is never downloaded twice and only missing portions of data are downloaded;
    • Support for parallel computing. Users can easily set up concurrent R sessions for faster download of data.
  • Practicality

    • yfR innovates with a "collection" system, where one can easily import a collection of tickers such as the SP500 composition in a single function call.
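
For illustration, here is a minimal sketch of a typical call. The argument names (first_date, last_date, freq_data) follow the discussion in this thread and may differ slightly in the released version:

library(yfR)

# download daily quotes for two tickers over the last year and
# aggregate them to a monthly frequency, keeping the same structure
prices <- yf_get(
  tickers    = c("AAPL", "MSFT"),
  first_date = Sys.Date() - 365,
  last_date  = Sys.Date(),
  freq_data  = "monthly"
)

head(prices)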

Thank you @msperlin, I am discussing with the other editors and will get back to you. Thanks, Julia

Thanks for your patience @msperlin. The fit seems to be good for us and I am now looking for a handling editor.
Thanks, Julia

Great, thanks @jooolia.

Assigned! @melvidoni is now the editor

@ropensci-review-bot seeking reviewers

Please add this badge to the README of your package repository:

[![Status at rOpenSci Software Peer Review](https://badges.ropensci.org/523_status.svg)](https://github.com/ropensci/software-review/issues/523)

Furthermore, if your package does not have a NEWS.md file yet, please create one to capture the changes made during the review process. See https://devguide.ropensci.org/releasing.html#news

Thanks. The badge is added in dc712f4abac246604721ed7f2926f9794e4e7f99 and the news file already exists.

Hi @melvidoni !
I would like to review this package


Hello @Athene-ai, of course, this package is still needing reviewers. I saw you wrote on several packages, so be mindful that asking in multiple places may not be ideal, as you may end up with more workload than intended. The review timeframe for this is 3 weeks, so if that's okay with you, I'll assign you to this package (and you'll have to complete this review first before accepting any others).

@melvidoni I accept the invitation to review this package within three weeks

@Athene-ai added to the reviewers list. Review due date is 2022-05-26. Thanks @Athene-ai for accepting to review! Please refer to our reviewer guide.

@Athene-ai: If you haven't done so, please fill this form for us to update our reviewers records.

@melvidoni thanks for adding me as reviewer and I filled the volunteer form for being an rOpenSci Reviewer :-)

@melvidoni do we have a slack channel?

@s3alfisc added to the reviewers list. Review due date is 2022-05-29. Thanks @s3alfisc for accepting to review! Please refer to our reviewer guide.

@s3alfisc: If you haven't done so, please fill this form for us to update our reviewers records.

@melvidoni do we have a slack channel?

Hello @Athene-ai. Please, be mindful that responses are not immediate, especially over the weekend; kindly do not hasten people, and wait for responses/actions. There is much going on "behind the scenes" that you may not be aware of.

That said, you'll get an invitation to the Slack later in the process.


Thanks for the information 😊

@Athene-ai Could you please paste a completed review here? Rather than adding more comments to this issue, you may leave that template there for now, and update it with an actual review when you've got that far. It's best to complete the template offline, edit the issue to delete all current content, and then simply paste the completed review back in place of the above comment. Thanks.

@ropensci-review-bot remove @Athene-ai from reviewers

@Athene-ai removed from the reviewers list!

@msperlin we apologise for the issues caused with the prior reviewer. They have now been removed from the list of reviewers, and I will proceed to search for another reviewer. Please understand that although we try to give everyone an opportunity, sometimes it is not possible to foresee how they will take it.

I will strive to get a new reviewer, but the person will be given 3 weeks from the acceptance date, hence some delays are bound to happen.

Edit: wrong punctuation, apologies.

Good morning @melvidoni.

No problem at all. I can wait.

Best,

@thisisnic added to the reviewers list. Review due date is 2022-06-13. Thanks @thisisnic for accepting to review! Please refer to our reviewer guide.

@thisisnic: If you haven't done so, please fill this form for us to update our reviewers records.

title: "review"
output:
  rmarkdown::md_document:
    pandoc_args: ["--wrap=none"]

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

  • Briefly describe any working relationship you have (had) with the package authors. None
  • ☒ As the reviewer I confirm that there are no conflicts of interest for me to review this work (if you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

  • A statement of need: clearly stating problems the software is designed to solve and its target audience in README. While a readme and pkgdown website exist, I believe that the documentation could be greatly improved - see my comments below.
  • Installation instructions: for the development version of package and any non-standard dependencies in README
  • Vignette(s): demonstrating major functionality that runs successfully locally. Vignette runs locally.
  • Function Documentation: for all exported functions
  • Examples: (that run successfully locally) for all exported functions
  • Community guidelines: including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R). All available.

You can find more comments on documentation below.

Functionality

  • Installation: Installation succeeds as documented on my Windows machine. On GitHub Actions, the R CMD check currently fails for macOS.
  • Functionality: Any functional claims of the software have been confirmed.
  • Performance: Any performance claims of the software have been confirmed.
    There are no performance claims made in the package.
  • Automated tests: Unit tests cover essential functions of the package and a reasonable range of inputs and conditions.
    All tests pass on the local machine. But coverage is below 80% - I would love to see this go above 95% :)
  • Packaging guidelines: The package conforms to the rOpenSci packaging guidelines.
    • ☐ package name: maybe a search engine optimized name for the package would be yfinanceR - the name of the equivalent Python package is yfinance
    • ☐ create package metadata with codemeta
    • ☐ As per the R CMD check failure for macOS, the package currently does not run on macOS.
    • ☒ functions have descriptive names; snake case; function argument order consistent across functions; no conflicts with base packages
    • ☒ console messages via error() and warning()
    • ☐ no citation file is included
    • ☐ (consider creating a r-universe profile & adding a r-universe badge, r-universe is great :) )
    • ☐ add installation instructions using remotes, pak, etc
    • ☒ code of conduct, contribution guidelines available
    • ☐ the package does not contain top-level documentation - i.e. ??yfR does not return any documentation of the package
    • ☐ add a link to the readme pointing to further, more extensive documentation on Yahoo Finance
    • ☐ "If your package provides access to a data source, we require that DESCRIPTION contains both (1) A brief identification and/or description of the organisation responsible for issuing data; and (2) The URL linking to public-facing page providing, describing, or enabling data access (which may often differ from URL leading directly to data source)."
    • roxygen2 is used
    • ☐ in general, @return statements specify the returned data object. But you could be more specific - usually, tibbles are returned, not base data.frames
    • @noRd is used for non-exported functions
    • pkgdown website exists
    • ☒ license: MIT
    • ☒ all user facing functions have examples
    • ☐ package dependencies: by running pkgstats, it looks like there are multiple package dependencies that you could easily replace by using base functions, e.g. dplyr, magrittr, tibble. What is the advantage of using readr::read_rds() over base::readRDS()? The packages curl and cli are not detected as used by pkgstats. Nevertheless, all imported packages are of high quality, so I have no concerns here.

Estimated hours spent reviewing: 8

  • ☒ Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer (“rev” role) in the package DESCRIPTION file.

Additional Comments

I think that yfR is a very promising package with useful features, and I believe that it will be widely used. I very much enjoyed using it! To improve the package, I mostly suggest investing more time in refining the documentation.

Documentation

  • Statement of need: I would like to see a more refined statement of need at the beginning of the readme: what is yfR’s main innovation? E.g. start with something like “yfR is an API to yahoo finance. It speeds up the data downloading process by parallel computing and local caching.” Then explain what type of data yahoo finance includes.
  • I would move the discussion of data quality / limitations of yahoo finance and comparison to BatchGetSymbols to separate articles - I don’t think they are required in the readme.
  • If you want to keep the reference to quantmod, maybe include a dedicated ‘Acknowledgements’ section at the end of the readme?
  • Occasionally, you use jargon: e.g., not all users might know what a ticker is.
  • I would move all examples from the readme to the ‘get started’ vignette. Alternatively, I would keep only one example in the readme.
  • In the ‘get started’ vignette, I would hide the message output generated e.g. by yf_get() and explain in words what the function does: e.g. it checks the cache, downloads data if the cache is empty, else finishes etc.
  • The vignette states that multiple ‘collections’ are organized in the package. It would be great to include a full list of collections to the docs, e.g. as a separate article? The yf_get_available_collections() helps here, but what do the individual collections stand for? E.g. does IBOV stand for the Bovespa-Index?
  • I would like to see some documentation on how the caching works: e.g., where are files saved? For how long are they saved? Is the cache ever cleaned, e.g. are cached files lost by re-starting the R session?
  • In the docs for yf_convert_to_wide, it would be good to print the initial long dataframe.
  • The documentation of yf_get() does not really, as a stand-alone, explain what the function does: download ticker data from Yahoo Finance, caching, parallelism, etc. I would delete the reference to getSymbols. Note that as yf_get_default_cache_folder() is not exported, users will run into an error when trying yfR::yf_get_default_cache_folder(). Also, mention that the ticker function argument is vectorized.
  • You could improve the documentation for parallelism: I myself have never used furrr, so your hint to furrr::plan() is not too helpful. How about a dedicated article with a small example that illustrates how to run get_plan() in parallel? Also, I only learned from browsing the code that by default, half of all available cores are used.
  • What is the difference between a collection and an index?
  • Consider adding documentation of the data returned via yf_get(). Not being a financial economist, I for example have no idea what the price_adjusted column stands for. Beyond, what is the unit of measurement of the price variables? I suppose it is US Dollars? Further, what is the relationship between daily data and monthly data? Also, potentially add a note that when markets are closed, no data row will be created.
  • examples could be more 'verbose', i.e. add documentation
  • also, examples could be more 'exhaustive' - they are quite minimal at the moment
  • the example for yf_convert_to_wide currently calls internal data - could you not simply attach the data set or load it?

Installation, Local CMD Check & pkgcheck

  • Installation and CMD check pass without problems
  • I tried to run pkgcheck, but failed to get it to run. I suggest running the pkgcheck action on GitHub Actions, at least for the duration of the review.

Testing

  • Code Coverage is currently only at around 80% - I would love to see this up at 95%, if not 100 :)

Functionality

  • All examples work very nicely. Overall, it was a lot of fun using the package!
  • In general, the console output is very helpful and very pretty!
  • I am not sure if I would have default function arguments for first_date() and last_date(). If you want to keep it, I would change it from 15 days to one month.
  • yf_convert_to_wide() is super helpful - great idea to directly include it in the package!
  • Could the API be more permissive, e.g. accept dates with format dd-mm-yyyy?
  • When trying the “SP500” collection example, I ran into several ‘error in download’ errors. Still, the function finished eventually with ‘binding price data’. What exactly is going on here? Did the function eventually manage to fetch all tickers? If not, could there be a final message, e.g. ‘300/500 tickers successfully fetched. To fetch all others, do this …’.
  • I have seen that there is already a PR opened to alert users when they have reached the Yahoo Finance limit. This would indeed be a great feature!

Additional Functionality

  • It would be great to add further collections, e.g. NASDAQ, DAX, SP30, FAANG etc
  • The equivalent Python package, yfinance, offers a range of additional functionality, e.g. data on dividends, stock splits, and institutional investors. Do you plan to incorporate any of these into the package in the future?
  • Currently, the cached files are saved in the rds file format via readr::read_rds(). There might be faster and/or more memory-friendly alternatives available. Have you considered adding a function argument that would allow users to store files e.g. in the parquet file format?
  • Have you considered integrating an autoplot function to plot stock prices? autoplot would e.g. generate plots similar to those created in the readme / vignette.
  • Would it be possible to give an estimate of the memory consumed by all cached files prior to a download? I would also consider exporting yf_get_default_cache_folder() so that users are aware of the function and can easily check where yfR creates the cache.

Misc

  • Do you need to export the magrittr pipe when using it internally?
  • I took a brief glance at the error messages, and most of them are clear and easy to understand. Maybe you could rephrase
  # check for NA
  if (any(is.na(tickers))) {
    my_msg <- paste0(
      "Found NA value in ticker vector.",
      "You need to remove it before running BatchGetSymbols."
    )
    stop(my_msg)
  }
  
  if (class(first_date) != "Date") {
    stop("ERROR: cant change class of first_date to 'Date'")
  }

In general, I really like the dreamerr package for function input type checks. checkmate seems to be very popular, too.

With dreamerr, you could e.g. write

  # check threshold
  if ((thresh_bad_data < 0) | (thresh_bad_data > 1)) {
    stop("Input thresh_bad_data should be a proportion between 0 and 1")
  }

as

dreamerr::check_arg(thresh_bad_data, "scalar numeric GT{0} LT{1}")
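
For comparison, a roughly equivalent guard with checkmate (also just a suggestion, not yfR's actual code):

# fails with an informative message unless thresh_bad_data is a number in [0, 1]
checkmate::assert_number(thresh_bad_data, lower = 0, upper = 1)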

I can’t really follow this error message:

  if (!flag) {
    warning(stringr::str_glue(
      "\nIt seems you are using a non-default cache folder at {cache_folder}. ",
      "Be aware that if any stock event -- split or dividend -- happens ",
      "in between cache files, the resulting aggregate cache data will not ",
      "correspond to reality as some part of the price data will not be ",
      "adjusted to the event. For safety and reproducibility, my suggestion ",
      "is to use cache system only for the current session with tempdir(), ",
      "which is the default option."
    ))
  }
  • The collections are created via hard-coded (Wikipedia) URLs. This is likely prone to errors - what if e.g. the URLs change? I understand the attractiveness of this ‘dynamic’ lookup, as e.g. the composition of stock indices might change over time. Maybe you could add a second look-up link (in case the main URL breaks), or you could add a ‘fallback’ data.frame containing the names of all firms included in an index at a fixed date to fall back to? See also this link on potential error handling of URLs via tryCatch (a sketch of this pattern follows after this list).

  • My last comment (repeating something I mentioned above): the equivalent python package is called yfinance. Maybe a better / SEO optimized name for the package would be yfinanceR?
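
A hypothetical sketch of the fallback pattern suggested above; the URL, the function name, and the bundled fallback object are illustrative and are not yfR's actual code:

get_index_composition <- function() {
  url <- "https://en.wikipedia.org/wiki/List_of_S%26P_500_companies"

  tryCatch(
    {
      page   <- rvest::read_html(url)
      tables <- rvest::html_table(page)
      tables[[1]]  # first table on the page holds the index composition
    },
    error = function(e) {
      message("Live lookup failed (", conditionMessage(e),
              "); using bundled fallback table.")
      fallback_sp500_composition  # hypothetical data.frame shipped with the package
    }
  )
}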

Thanks @s3alfisc for the review! Appreciate it. Good ideas there.

I'll reply to all your comments in the next couple of days.

Logged review for s3alfisc (hours: 8)

Dear @s3alfisc , please find my replies below:


I think that yfR is a very promising package with useful features, and I believe that it will be widely used. I very much enjoyed using it! To improve the package, I mostly suggest investing more time in refining the documentation.

Thanks, appreciate the feedback and the detailed review. Given your feedback and ideas, I've made many changes in the code and documentation.

Documentation

Statement of need: I would like to see a more refined statement of need at the beginning of the readme: what is yfR’s main innovation? E.g. start with something like “yfR is an API to yahoo finance. It speeds up the data downloading process by parallel computing and local caching.” Then explain what type of data yahoo finance includes.

Also thanks. I changed the readme.rmd file so that the reader can quickly grasp how to use the package.

I would move the discussion of data quality / limitations of yahoo finance and comparison to BatchGetSymbols to separate articles - I don’t think they are required in the readme. If you want to keep the reference to quantmod, maybe include a dedicated ‘Acknowledgements’ section at the end of the readme? Occasionally, you use jargon: e.g., not all users might know what a ticker is. I would move all examples from the readme to the ‘get started’ vignette. Alternatively, I would keep only one example in the readme.

I reorganized the topics in the readme.rmd and moved some as vignettes.

In the ‘get started’ vignette, I would hide the message output generated e.g. by yf_get() and explain in words what the function does: e.g. it checks the cache, downloads data if the cache is empty, else finishes etc.

I'd rather keep the yfR messages in the vignettes as they mimic the actual call to the function. I also improved the text in the main vignette ("get started").

The vignette states that multiple ‘collections’ are organized in the package. It would be great to include a full list of collections to the docs, e.g. as a separate article? The yf_get_available_collections() helps here, but what do the individual collections stand for? E.g. does IBOV stand for the Bovespa-Index?

Great idea. I added argument print_description to yf_get_available_collections() for printing a text description of the available collections:

[screenshot of console output]

I would like to see some documentation on how the caching works: e.g., where are files saved? For how long are they saved? Is the cache ever cleaned, e.g. are cached files lost by re-starting the R session?

I added a section at the help file of yf_get(), explaining how the cache system works.

[screenshot of the new help section]

In the docs for yf_convert_to_wide, it would be good to print the initial long dataframe.

Done.

The documentation of yf_get() does not really, as a stand-alone, explain what the function does: download ticker data from Yahoo Finance, caching, parallelism, etc. I would delete the reference to getSymbols. Note that as yf_get_default_cache_folder() is not exported, users will run into an error when trying yfR::yf_get_default_cache_folder().

Documentation was improved.

Also, mention that the ticker function argument is vectorized

Done.

You could improve the documentation for parallelism: I myself have never used furrr, so your hint to furrr::plan() is not too helpful. How about a dedicated article with a small example that illustrates how to run get_plan() in parallel? Also, I only learned from browsing the code that by default, half of all available cores are used.

I think that going into parallelism and furrr::plan() would be off topic. However, I added a link to furrr (https://furrr.futureverse.org/) in the documentation of argument do_parallel, so that users can learn more about it, if desired.
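
For readers landing here, a rough sketch of enabling parallel downloads. Note that plan() comes from the future package (which furrr builds on); the worker count and tickers are illustrative:

library(future)
plan(multisession, workers = 2)  # start two background R sessions

prices <- yfR::yf_get(
  tickers     = c("AAPL", "MSFT", "GOOG"),
  first_date  = "2020-01-01",
  last_date   = "2020-12-31",
  do_parallel = TRUE
)

plan(sequential)  # restore the default sequential backend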

What is the difference between a collection and an index?

A collection is just a bunch of tickers put together. An index can be a collection, but not all collections are indices.

Consider adding documentation of the data returned via yf_get(). Not being a financial economist, I for example have no idea what the price_adjusted column stands for. Beyond, what is the unit of measurement of the price variables? I suppose it is US Dollars? Further, what is the relationship between daily data and monthly data? Also, potentially add a note that when markets are closed, no data row will be created.

Done. New documentation is available at readme.rmd and also in help for yf_get().

[screenshot of the documentation]

examples could be more 'verbose', i.e. add documentation; also, examples could be more 'exhaustive' - they are quite minimal at the moment; the example for yf_convert_to_wide currently calls internal data - could you not simply attach the data set or load it?

I revised all examples, especially for the main function. I've made a few changes, but they look alright to me. Users can always check the vignettes for more details.

Installation, Local CMD Check & pkgcheck

Installation and CMD check pass without problems. I tried to run pkgcheck, but failed to get it to run. I suggest running the pkgcheck action on GitHub Actions, at least for the duration of the review.

I also failed to use pkgcheck on Linux (Ubuntu/Mint). I can't install its dependencies, despite spending some time trying hard.

Testing

Code Coverage is currently only at around 80% - I would love to see this up at 95%, if not 100 :)

I tried my best to cover as much as possible, reaching 82.99%. One big miss is the parallel computing part, which is not active in the current version (I removed it due to YF limits on API calls). There is a fix in progress, but it depends on a quantmod update reaching CRAN. I'll add the parallel tests once it is fixed.

The rest is just input error checking, which, to me, feels fine to leave uncovered (covering it would just be a gimmick). So I will not reach 100%, but will be close.

Functionality

All examples work very nicely. Overall, it was a lot of fun using the package! In general, the console output is very helpful and very pretty!

Great, thanks!

I am not sure if I would have default function arguments for first_date() and last_date(). If you want to keep it, I would change it from 15 days to one month.

Done.

yf_convert_to_wide() is super helpful - great idea to directly include it in the package!

Thanks. I know some people use the data that way, even though I don't like it.

Could the API be more permissive, e.g. accept dates with format dd-mm-yyyy?

I feel that ISO format is fine. This is the standard in R and users should probably adapt to it.

When trying the “SP500” collection example, I ran into several ‘error in download’ errors. Still, the function finished eventually with ‘binding price data’. What exactly is going on here? Did the function eventually manage to fetch all tickers? If not, could there be a final message, e.g. ‘300/500 tickers successfully fetched. To fetch all others, do this …’.

Good idea. I implemented the message. The user will now be aware of the percentage of requested tickers present in the output data. Whenever that share is lower than 50%, a message tells the user to wait 15 minutes before running the call again.

Good: [screenshot of console output]

Bad: [screenshot of console output]

I have seen that there is already a PR opened to alert users when they have reached the Yahoo Finance limit. This would indeed be a great feature!

We are working on this issue, already with a viable solution that should become official soon. Nonetheless, the package works fine in a single session in all my tests.

Additional Functionality

It would be great to add further collections, e.g. NASDAQ, DAX, SP30, FAANG etc

Yes! Definitely. The idea is to have something for everyone.

The equivalent Python package, yfinance, offers a range of additional functionality, e.g. data on dividends, stock splits, and institutional investors. Do you plan to incorporate any of these into the package in the future?

No. My proposal is to focus on stock data importing and organization only.

Currently, the cached files are saved in the rds file format via readr::read_rds(). There might be faster and/or more memory-friendly alternatives available. Have you considered adding a function argument that would allow users to store files e.g. in the parquet file format?

I believe that .rds files work fine for yfR (I never saw a performance issue). But I'll keep that in mind.
Also, this is very easy to change in the future.
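
For reference, a small sketch of the two storage options discussed above (arrow is not a yfR dependency; the file names and data frame are illustrative):

df <- data.frame(ticker = "AAPL", ref_date = Sys.Date(), price_close = 150)

readr::write_rds(df, "cache_aapl.rds")            # current .rds approach
df_rds <- readr::read_rds("cache_aapl.rds")

arrow::write_parquet(df, "cache_aapl.parquet")    # possible alternative format
df_parquet <- arrow::read_parquet("cache_aapl.parquet")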

Have you considered integrating an autoplot function to plot stock prices? autoplot would e.g. generate plots similar to those created in the readme / vignette.

No, but I'll also keep it in mind.

Would it be possible to give an estimate of the memory consumed by all cached files prior to a download? I would also consider exporting yf_get_default_cache_folder() so that users are aware of the function and can easily check where yfR creates the cache.

Probably, but I feel that file size is not really an issue. The cache files are really small.

Nonetheless, I added a "Diagnostics" text at the end of the execution of yf_get. It includes the current size of cache files (see previous figure with output "Diagnostics").

Also, function yf_get_default_cache_folder() is now exported and available to users.

Misc

Do you need to export the magrittr pipe when using it internally?

This was implemented so yfR is compatible with R >= 4.0.0 (personally, I prefer the new pipe).

I was not aware that exporting it is unnecessary (I simply used usethis::use_pipe() when creating the package). I also feel that no harm is done in allowing the user access to the pipe when loading yfR (I'm not aware of any conflicts).

I took a brief glance at the error messages, and most of them are clear and easy to understand. Maybe you could rephrase

Thanks, I fixed that.

In general, I really like the dreamerr package for function input type checks. checkmate seems to be very popular, too.

Thanks for the suggestion. I was not aware of this package. I'll have a look but, for the time being, I'll stay with the current code.

I can’t really follow this error message: "\nIt seems you are using a non-default cache folder at {cache_folder}. ",

I tried my best, but the explanation is more technical than what I can put in a message. What the user should know is that, for stocks, there is no guarantee that cache files can be merged without problems. This happens because corporate events, such as dividends, can alter the adjusted prices retroactively. So you can get a different adjusted price for the same ticker/day if the query is made on different days.

I changed the text so that the explanation is clearer.

The collections are created via hard coded (wikipedia) URLs. This is likely prone to errors - what if e.g. the URLs change? I understand the attractiveness of this ‘dynamic’ lookup, as e.g. the composition of stock indices might change over time. Maybe you could add a second look-up link (in case the main URL breaks), or you could add a ‘fallback’ data.frame containing the names of all firms included in an index at a fixed date to fall back to? See also this link on potential error handling of URLs via tryCatch.

The fallback data frame is a great idea, and I implemented it. I don't like the first option (a backup URL), as it requires more web-scraping code, which can be very unstable and hard to maintain.

I also implemented argument force_fallback in yf_get_index_comp, which allows the user to read the offline files directly.

My last comment (repeating something I mentioned above): the equivalent python package is called yfinance. Maybe a better / SEO optimized name for the package would be yfinanceR?

I really liked the name yfR. It's short and easy to remember. But thanks for the suggestion.

All changes are in the main branch.

I am currently working on my review of this package, and hope to finish it in the next few days if nothing unexpected comes up! I had an issue when I was running the examples in the vignette though, and so to deliver partial feedback which might be useful in the meantime, I've opened this issue relating to it on the project repo: ropensci/yfR#11


Hello Nicola, that's great, thank you!

Thanks @thisisnic,

No rush for the review. I'm also on holidays and might take a while to reply here.

As for the issue, I think you might have a problem with quantmod. Please update to the latest version on CRAN and try again.

Thanks @msperlin - no problem! Great, that fixes it for me! Would it be worth perhaps adding the version to the DESCRIPTION file as described here[1] in order to prevent other people with older package versions running into the same error, and wondering why it's not working?

[1] https://r-pkgs.org/description.html#minimum-versions
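
For reference, declaring a minimum version in DESCRIPTION would look roughly like this (the version number below is illustrative, not the actual minimum required):

Imports:
    quantmod (>= 0.4.18),
    ...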

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

  • Briefly describe any working relationship you have (had) with the package authors.
  • As the reviewer I confirm that there are no conflicts of interest for me to review this work (if you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

  • A statement of need: clearly stating problems the software is designed to solve and its target audience in README
  • Installation instructions: for the development version of package and any non-standard dependencies in README
  • Vignette(s): demonstrating major functionality that runs successfully locally
  • Function Documentation: for all exported functions
  • Examples: (that run successfully locally) for all exported functions
  • Community guidelines: including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R).

Functionality

  • Installation: Installation succeeds as documented.
  • Functionality: Any functional claims of the software have been confirmed.
  • Performance: Any performance claims of the software have been confirmed.
  • Automated tests: Unit tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.
  • Packaging guidelines: The package conforms to the rOpenSci packaging guidelines. * (see below comment)

Estimated hours spent reviewing: 4

  • Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer ("rev" role) in the package DESCRIPTION file.

Review Comments

Overall, I think this is an excellent package and I especially appreciate that it both retrieves the data and puts it into tidy format - as someone who's taught workshops in the past, this kind of thing is really helpful!

As @s3alfisc did such a good job of covering areas around design and possible extensions, I thought I'd instead focus on code review, and areas that I might tweak to ensure the code is as maintainable as possible. A lot of the comments below are stylistic, or just questions, and so feel free to take or leave as many of them as you wish.

  • The only point which I think definitely does need addressing is mentioned in the comment above which references an issue opened on the repo - adding a minimum version of quantmod to the DESCRIPTION file, as otherwise the examples shown in the vignette fail to run correctly.

This is my first review, so I hope this is as expected, and if there are any areas which I've missed or focussed on too much, I'm happy to receive feedback on that.

Comments on yf_get.R

This is well documented. There are places where I might choose to shorten parameter names to make them more easily memorable, but this is a personal preference rather than a recommendation. The code is skimmable, the comments help guide me through what's happening, variable names are intuitive, and error messages are informative.

A few typos: "cant" instead of "can't".

Some error messages have the text "ERROR" in them - I'd recommend removing these, as this is duplicating the in-built error text, e.g. it outputs as Error: ERROR: cant change class of last_date to 'Date'

I see some code which edits a dplyr option: options(dplyr.summarise.inform = FALSE) and later resets it: options(dplyr.summarise.inform = TRUE). Could it perhaps be useful to the user if you were to check if it is already set or set to a different value, and use a call to on.exit() to reset it to its previous value later on, to prevent changing their chosen setting without them knowing? [Edited to add: you can also do this via withr::local_options()]
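
A small sketch of that suggestion (not yfR's actual code): save the current option value, change it, and let on.exit() or withr restore it when the function exits:

my_fun <- function() {
  old <- options(dplyr.summarise.inform = FALSE)  # options() returns the previous values
  on.exit(options(old), add = TRUE)               # restore them when the function exits
  # ... function body ...
}

# or, equivalently, with withr (restored automatically at the end of the call):
my_fun2 <- function() {
  withr::local_options(list(dplyr.summarise.inform = FALSE))
  # ... function body ...
}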

There is some commented out code in yf_get.R (lines 270 - 273) - can this be removed?

This function does a lot of things, and I might consider abstracting out some of the logic into smaller helper functions in future to make it easier to test in a more modular way. However, I don't think this is a problem here, as the code is well commented and flows in a logical order.

Comments on collections.R

Again, well commented and well documented, which is great.

It looks like yf_get_available_collections() just calls yf_get_available_indices() and returns the results - could you instead define it more simply, e.g. yf_get_available_collections() <- yf_get_available_indices()? And, what's the reason for having both? (This function changed slightly between when I first reviewed this and later came back to it, but I think the question is still relevant)
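
For illustration, the alias would be a plain assignment of the function object, with no parentheses on the right-hand side (a sketch, not the package's actual definition):

yf_get_available_collections <- yf_get_available_indices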

yf_collection_get() - it says it returns a "dataframe" - perhaps this could be either "data frame" (the concept) or "data.frame" (the object type) for more clarity?

Quick question - any reason this file doesn't have the yf prefix, but the other ones do? Not important, just curious.

Another tiny comment - I wonder if yf_get_collection() might feel like a better fit with the other function names than yf_collection_get()?

Comments on yf_convert_to_wide.R

Chunk of whitespace within the definition of fct_format_wide() which could be removed.

Comments on yf_get_clean_data.R

Some commented out code here - can this be removed?

Comments on yf_get_single_ticker.R

Love the use of the cli package to add messages to really help the user see what's going on.

Comments on "getting started" vignette

In the plot comparing the daily/weekly/monthly/yearly data, I might consider removing geom_point() and just using geom_line() as during a naive look at the plot, I wondered what the thickness of the lines meant before realising that the points had blurred together. I might also use breaks on the x axis with a higher frequency than five years.
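
A rough ggplot2 sketch of that tweak; the data object and column names are illustrative, not the vignette's actual code:

library(ggplot2)

ggplot(prices, aes(x = ref_date, y = price_adjusted, color = freq)) +
  geom_line() +                                              # no geom_point()
  scale_x_date(date_breaks = "1 year", date_labels = "%Y")   # denser x-axis breaks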

Logged review for thisisnic (hours: 4)

Dear @thisisnic, please find my replies below.

Again, thanks for the time reviewing my package. I've made many changes based on your comments. All changes are in the main branch.

Review Comments

Overall, I think this is an excellent package and I especially appreciate that it both retrieves the data and puts it into tidy format - as someone who's taught workshops in the past, this kind of thing is really helpful!

Thanks! I'm glad you enjoyed using it.

As @s3alfisc did such a good job of covering areas around design and possible extensions, I thought I'd instead focus on code review, and areas that I might tweak to ensure the code is as maintainable as possible. A lot of the comments below are stylistic, or just questions, and so feel free to take or leave as many of them as you wish.

The only point which I think definitely does need addressing is mentioned in the comment above which references an issue opened on the repo - adding a minimum version of quantmod to the DESCRIPTION file, as otherwise the examples shown in the vignette fail to run correctly.

Yes, this is fixed in the latest commit 2749f6fbdad562fe6f963e409470916d58f6fdb2.

This is my first review, so I hope this is as expected, and if there are any areas which I've missed or focussed on too much, I'm happy to receive feedback on that.

Comments on yf_get.R

This is well documented. There are places where I might choose to shorten parameter names to make them more easily memorable, but this is a personal preference rather than a recommendation. The code is skimmable, the comments help guide me through what's happening, variable names are intuitive, and error messages are informative.

Thanks.

A few typos: "cant" instead of "can't".

Fixed.

Some error messages have the text "ERROR" in them - I'd recommend removing these, as this is duplicating the in-built error text, e.g. it outputs as Error: ERROR: cant change class of last_date to 'Date'

Fixed.

I see some code which edits a dplyr option: options(dplyr.summarise.inform = FALSE) and later resets it: options(dplyr.summarise.inform = TRUE). Could it perhaps be useful to the user if you were to check if it is already set or set to a different value, and use a call to on.exit() to reset it to its previous value later on, to prevent changing their chosen setting without them knowing? [Edited to add: you can also do this via withr::local_options()]

You're right. I changed the code so that the user choice is always respected.

There is some commented out code in yf_get.R (lines 270 - 273) - can this be removed?

Done.

This function does a lot of things, and I might consider abstracting out some of the logic into smaller helper functions in future to make it easier to test in a more modular way. However, I don't think this is a problem here, as the code is well commented and flows in a logical order.

I agree. The function is getting long. I'll keep that in mind.

Comments on collections.R

Again, well commented and well documented, which is great.

Thanks!

It looks like yf_get_available_collections() just calls yf_get_available_indices() and returns the results - could you instead define it more simply, e.g. yf_get_available_collections() <- yf_get_available_indices()? And, what's the reason for having both? (This function changed slightly between when I first reviewed this and later came back to it, but I think the question is still relevant)

The idea is to later have collections that are not indices. This is why I broke it into separate functions.

yf_collection_get() - it says it returns a "dataframe" - perhaps this could be either "data frame" (the concept) or "data.frame" (the object type) for more clarity?

Done.

Quick question - any reason this file doesn't have the yf prefix, but the other ones do? Not important, just curious.

My mistake. It's fixed.

Another tiny comment - I wonder if yf_get_collection() might feel like a better fit with the other function names than yf_collection_get()?

I followed the convention of using "object_verb" when naming functions. In fact, I just realized I used the wrong convention in some functions.
For consistency, I changed all names to the "object_verb" pattern, except yf_get(), which I feel is short, concise and intuitive.

Comments on yf_convert_to_wide.R

Chunk of whitespace within the definition of fct_format_wide() which could be removed.

Done.

Comments on yf_get_clean_data.R

Some commented out code here - can this be removed?

Done.

Comments on yf_get_single_ticker.R

Love the use of the cli package to add messages to really help the user see what's going on.

Thanks. I really like it too.

Comments on "getting started" vignette

In the plot comparing the daily/weekly/monthly/yearly data, I might consider removing geom_point() and just using geom_line() as during a naive look at the plot, I wondered what the thickness of the lines meant before realising that the points had blurred together. I might also use breaks on the x axis with a higher frequency than five years.

Agreed. To fix it, I increased the time period and removed the geom_point() layer.

@thisisnic and @s3alfisc what are your thoughts on the changes?
Do let me know if you are satisfied with these changes, or if there is anything else outstanding.

Hi @melvidoni, I will try to take a closer look at @msperlin's comments this weekend / early next week =)

Thanks for the reminder there @melvidoni - I am happy with the changes made.

Thanks @thisisnic! We will wait until after the weekend for @s3alfisc's comments then!

Hi all, I am very happy with the changes. The package looks and works great! In particular, I am super happy about how the documentation has evolved! :)

Approved! Thanks @msperlin for submitting and @s3alfisc, @thisisnic for your reviews! 😁

To-dos:

  • Transfer the repo to rOpenSci's "ropensci" GitHub organization under "Settings" in your repo. I have invited you to a team that should allow you to do so. You will need to enable two-factor authentication for your GitHub account.
    This invitation will expire after one week. If that happens, write a comment @ropensci-review-bot invite me to ropensci/<package-name> which will re-send an invitation.
  • After transfer write a comment @ropensci-review-bot finalize transfer of <package-name> where <package-name> is the repo/package name. This will give you admin access back.
  • Fix all links to the GitHub repo to point to the repo under the ropensci organization.
  • Delete your current code of conduct file if you had one since rOpenSci's default one will apply, see https://devguide.ropensci.org/collaboration.html#coc-file
  • If you already had a pkgdown website and are ok relying only on rOpenSci central docs building and branding,
    • deactivate the automatic deployment you might have set up
    • remove styling tweaks from your pkgdown config but keep that config file
    • replace the whole current pkgdown website with a redirecting page
    • replace your package docs URL with https://docs.ropensci.org/package_name
    • In addition, in your DESCRIPTION file, include the docs link in the URL field alongside the link to the GitHub repository, e.g.: URL: https://docs.ropensci.org/foobar, https://github.com/ropensci/foobar
  • Fix any links in badges for CI and coverage to point to the new repository URL.
  • Increment the package version to reflect the changes you made during review. In NEWS.md, add a heading for the new version and one bullet for each user-facing change, and each developer-facing change that you think is relevant.
  • We're starting to roll out software metadata files to all rOpenSci packages via the Codemeta initiative, see https://docs.ropensci.org/codemetar/ for how to include it in your package, after installing the package - it should be as easy as running codemetar::write_codemeta() in the root of your package.
  • You can add this installation method to your package README install.packages("<package-name>", repos = "https://ropensci.r-universe.dev") thanks to R-universe.

Should you want to acknowledge your reviewers in your package DESCRIPTION, you can do so by making them "rev"-type contributors in the Authors@R field (with their consent).

Welcome aboard! We'd love to host a post about your package - either a short introduction to it with an example for a technical audience or a longer post with some narrative about its development or something you learned, and an example of its use for a broader readership. If you are interested, consult the blog guide, and tag @ropensci/blog-editors in your reply. They will get in touch about timing and can answer any questions.

We maintain an online book with our best practices and tips; this chapter starts the third section, which covers guidance for after onboarding (with advice on releases, package marketing, GitHub grooming); the guide also features CRAN gotchas. Please tell us what could be improved.

Last but not least, you can volunteer as a reviewer by filling in a short form.

Thanks @s3alfisc, @thisisnic! Really appreciate the feedback.

I'll make all changes and report here soon.

@ropensci-review-bot finalize transfer of yfR

Transfer completed.
The yfR team is now owner of the repository and the author has been invited to the team

@s3alfisc, @thisisnic Can I include you both as reviewers in the DESCRIPTION file?

Of course, @msperlin , thanks!

@melvidoni I've made all required changes. Please let me know if anything else is needed.
I'm also interested in a blog post @ropensci/blog-editors

Same with me @msperlin, and thanks! :)

@melvidoni I've made all required changes. Please let me know if anything else is needed. I'm also interested in a blog post @ropensci/blog-editors

Excellent, thanks. Please, tick all the boxes in the corresponding post if you can.


@melvidoni
I can't tick the boxes in the original post. Should I write a new reply?

No worries, then it's fine @msperlin

I'm planning to submit to CRAN tomorrow. Is that OK? (Sorry, but I did not see any guideline about when to submit to CRAN.)

It should be okay.

Hi @msperlin,

We'd love to have a blog post about yfR, thank you for offering!

We publish two types of articles, blog posts and technotes. Let us know which type of article you'd like to write and when you think you might have it ready by, and we'll pencil in a publication date that will give us time for a quick review (the date can always change if things go faster or slower than expected).

We have a guide for blog authors (https://blogguide.ropensci.org/) and when you're ready and have a pull request, just set me as the reviewer or ping me and I'll come by to give you some feedback.

Let me know if you have any questions, thanks!

Thanks @steffilazerte. I'll have a look at the material you posted. But I will need some time, maybe a couple of weeks, to write the post. I'll let you know.

Perfectly reasonable 😁

@steffilazerte I wrote a blog post. Just sent you the PR and set you as reviewer.