datapasta 2.0.0 'Fusilli Jerry'
The Goods
Introducing datapasta
datapasta is about reducing resistance associated with copying and pasting data to and from R. It is a response to the realisation that I often found myself using intermediate programs like Sublime to munge text into suitable formats. Addins and functions in datapasta support a wide variety of input and output situations, so it (probably) "just works". Hopefully tools in this package will remove such intermediate steps and associated frustrations from our data slinging workflows.
Prerequisites
- Linux users will need to install either xsel or xclip. These applications provide an interface to X selections (clipboard-like).
  - For example: sudo apt-get install xsel - it's 72kb...
- Windows and MacOS have nothing extra to do.
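On Linux, one way to check from within R whether either tool is already on the PATH is sketched below. It uses only base R; the variable names are illustrative and nothing here is part of datapasta itself.

```r
# Look up the two clipboard helpers on the PATH.
# Sys.which() returns "" for a program it cannot find.
clip_tools <- Sys.which(c("xsel", "xclip"))
has_tool <- nzchar(clip_tools)

# Only Linux needs one of these; other platforms have nothing extra to do.
if (Sys.info()[["sysname"]] == "Linux" && !any(has_tool)) {
  message("Neither xsel nor xclip was found; install one, e.g. sudo apt-get install xsel")
}
```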
Installation
- Get the package:
install.packages("datapasta")
- Set the keyboard shortcuts using Tools -> Addins -> Browse Addins, then click Keyboard Shortcuts...
Usage
Use with RStudio
Getting data into source
At the moment this package contains these RStudio addins that paste data to the cursor:
- tribble_paste, which pastes a table as a nicely formatted call to tibble::tribble().
  - Recommend Ctrl + Shift + t as shortcut.
  - Table can be delimited with tab, comma, pipe or semicolon.
- vector_paste, which will paste delimited data as a vector definition, e.g. c("a", "b") etc.
  - Recommend Ctrl + Alt + Shift + v as shortcut.
- vector_paste_vertical, which will paste delimited data as a vertically formatted vector definition.
  - Recommend Ctrl + Shift + v as shortcut.
  - Example output:
c("Mint",
"Fedora",
"Debian",
"Ubuntu",
"OpenSUSE")
- df_paste, which pastes a table on the clipboard as a standard data.frame definition rather than a tribble call. This has certain advantages in the context of reproducible examples and educational posts. Many thanks to Jonathan Carroll for getting this rolling and coding the bulk of the feature.
  - Recommend Ctrl + Alt + Shift + d as shortcut.
Getting Data out of an R session
There are two R functions available that accept R objects and output formatted text for pasting to other applications:
- dpasta accepts tibbles, data.frames, and vectors. Data is output in a format that matches the input class. Formatted text is pasted at the cursor.
- dmdclip accepts the same inputs as dpasta but inserts the formatted text onto the clipboard, preceded by 4 spaces so that it can be pasted as a preformatted block to GitHub, Stack Overflow etc.
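As a rough sketch of the two output functions, assuming datapasta is installed (plus clipr and a reachable clipboard for dmdclip). The example vector and data.frame are made up for illustration, and the guards are only there so the snippet does nothing on a machine that lacks the packages.

```r
# Illustrative objects; a tibble should work the same way.
distros <- c("Mint", "Fedora", "Debian")
releases <- data.frame(distro = distros, rank = 1:3, stringsAsFactors = FALSE)

if (requireNamespace("datapasta", quietly = TRUE)) {
  # Output matches the input class: a data.frame definition here,
  # written at the cursor (or to the console outside RStudio).
  datapasta::dpasta(releases)

  # Same idea for a vector.
  datapasta::dpasta(distros)

  # dmdclip() writes to the clipboard, so only try it in an interactive
  # session where clipr reports that a clipboard is usable.
  if (interactive() && requireNamespace("clipr", quietly = TRUE) &&
      clipr::clipr_available()) {
    datapasta::dmdclip(releases)
  }
}
```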
Use with other editors
The only hard dependency of datapasta is readr for type guessing. All the above *paste functions can be called directly instead of as an addin, and will fall back to console output if rstudioapi is not available.
On systems without access to the clipboard (or without clipr installed) datapasta can still be used to output R objects from an R session. dpasta is probably the only function you care about in this scenario.
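A minimal sketch of calling one of the paste functions directly from a plain R console, assuming datapasta and clipr are installed and a clipboard is reachable. The tab-delimited sample text is invented, and clipr::write_clip() only stands in for copying a table by hand.

```r
# Made-up tab-delimited table, standing in for something copied from a
# spreadsheet or web page.
sample_table <- "distro\trank\nMint\t1\nFedora\t2"

# Only touch the clipboard in an interactive session where both packages
# are present and clipr reports a usable clipboard.
can_use_clipboard <- interactive() &&
  requireNamespace("datapasta", quietly = TRUE) &&
  requireNamespace("clipr", quietly = TRUE) &&
  clipr::clipr_available()

if (can_use_clipboard) {
  clipr::write_clip(sample_table)
  # Called with no arguments, the function reads the clipboard; outside
  # RStudio the tribble call should be printed to the console.
  datapasta::tribble_paste()
}
```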
Custom Installation
datapasta imports clipr and rstudioapi so as to make installation smooth and easy for most users. If you wish to avoid installing an rstudioapi you will never use, you can use:
- install.packages("datapasta", dependencies = "Depends")
- Followed by install.packages("clipr") to enable clipboard features.
Pitfalls
- tribble_paste works well with CSVs, Excel files, and HTML tables, but is currently brittle with respect to irregular table structures like merged cells or multi-line column headings. For some reason Wikipedia seems chock full of these. :(
- Quoted CSV data, where the quotes contain commas, will not be parsed correctly.
- Nested list columns have limited support with tribble_paste(): nested lists will work, but nested tibbles will be converted to list calls.
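One possible workaround for the quoted-CSV case, offered as a sketch rather than anything built into datapasta: parse the text with base R's read.csv(), which handles commas inside quotes, and hand the resulting data.frame to dpasta. The sample text and column names are made up for illustration.

```r
# Made-up CSV text where the quoted fields contain commas.
csv_text <- 'label,count\n"a, b",1\n"c, d",2'

# read.csv() treats double quotes as field quoting by default, so the
# embedded commas stay inside the first column.
parsed <- utils::read.csv(text = csv_text, stringsAsFactors = FALSE)

# Emit the parsed table as a data.frame definition, if datapasta is installed.
if (requireNamespace("datapasta", quietly = TRUE)) {
  datapasta::dpasta(parsed)
}
```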
Prior art
This package is made possible by mdlincoln's clipr, and Hadley's packages tibble and readr (for data-type guessing). I especially appreciate clipr's thoughtful approach to the clipboard on Linux, which pretty much every other R clipboard package just nope'd out on.
Future developments
I am interested in expanding the types of objects supported by the output functions dpasta and dmdclip. Feel free to contribute your ideas to the open issues.
Bonus
0 to datapasta in 64 seconds via a video vignette: