Unix4j sed has nothing in common with Unix sed

Question

Closed this issue 4 years ago · 1 comments

I believe Unix4j is promising too much for "sed". The project describes itself as:
"Implementation of Unix command line tools in Java. You can use the commands that you know from Unix in a Java program---you can pipe the results of one command to another as you know it from Unix."

The standard Unix "sed" options are listed in the PS. Of these, Unix4j implements:
-n

and other Unix4j sed options:
-g
-p
-l (misleading, every sed uses it as the line-length parameter)
-I
-s
-a
-i (misleading, every sed uses it for in-place replacement)
-c
-d
-y

have nothing to do with Unix sed. Or am I missing something? Could you please document how to achieve a simple Unix sed search with in-place replacement (option "-i") in a file with Unix4j:

cat test.txt
Unix4j is not such a great tool as I thought.
sed -i 's/is not/is/' test.txt
cat test.txt
Unix4j is such a great tool as I thought.

Or let me know whether you have any plans to add a Unix-style "-i" sed option to the unix4j package soon.

If you have no such plans, please list on your project page only those unix4j commands that are implemented like the Unix command (like unix4j.grep).

Thanks
siarsky

PS:

$ sed --help
Usage: sed [OPTION]... {script-only-if-no-other-script} [input-file]...
-n, --quiet, --silent suppress automatic printing of pattern space
-e script, --expression=script add the script to the commands to be executed
-f script-file, --file=script-file add the contents of script-file to the commands to be executed
--follow-symlinks follow symlinks when processing in place
-i[SUFFIX], --in-place[=SUFFIX] edit files in place (makes backup if SUFFIX supplied)
-b, --binary open files in binary mode (CR+LFs are not processed specially)
-l N, --line-length=N specify the desired line-wrap length for the `l' command
--posix disable all GNU extensions.
-E, -r, --regexp-extended use extended regular expressions in the script (for portability use POSIX -E).
-s, --separate consider files as separate rather than as a single, continuous long stream.
--sandbox operate in sandbox mode.
-u, --unbuffered load minimal amounts of data from the input files and flush the output buffers more often
-z, --null-data separate lines by NUL characters
--help display this help and exit
--version output version information and exit

Answer 1 · 2021-01-09T04:27:18.000Z

Hi Siarsky,

Thanks for your question and your feedback.

Firstly, yes, there are some differences from the Unix commands as you know them. We tried to stick closely to a very minimal definition of the commands, but even there we deviate in places because unix4j uses some different concepts.

Secondly, we used the following command doc as a basis for our commands:
https://pubs.opengroup.org/onlinepubs/009695399/utilities/sed.html

Note that some of the letters above are referred to as functions in the sed documentation, but implemented as options in unix4j.

In-place replacement is currently not directly supported by unix4j. It is a bit tricky to implement and typically requires random access to files and/or in-memory buffering of parts of the file, which unix4j does not implement. (I am not sure how it is implemented internally in Unix variants; they may also write a new temp file first and rename it at the end.)
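For illustration, the temp-file-and-rename approach mentioned above can be sketched with the plain JDK. This is a minimal sketch, not unix4j code and not how any particular sed is implemented: the class name InPlaceReplace and the method replaceInPlace are hypothetical, and it assumes a UTF-8 text file, Java regex syntax, and a single replacement per line (like s/// without the g flag).

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.regex.Pattern;

// Hypothetical helper, plain JDK only; it does not use the unix4j API.
public class InPlaceReplace {

    // Replaces the first match of regex on every line of file, roughly like
    // sed -i 's/regex/replacement/'. The replacement uses Java syntax ($1, not \1).
    public static void replaceInPlace(Path file, String regex, String replacement)
            throws IOException {
        Pattern pattern = Pattern.compile(regex);
        // Create the temp file next to the original so the final move stays
        // on the same file system.
        Path dir = file.toAbsolutePath().getParent();
        Path tmp = Files.createTempFile(dir, "inplace", ".tmp");
        try {
            try (BufferedReader in = Files.newBufferedReader(file, StandardCharsets.UTF_8);
                 BufferedWriter out = Files.newBufferedWriter(tmp, StandardCharsets.UTF_8)) {
                String line;
                while ((line = in.readLine()) != null) {
                    out.write(pattern.matcher(line).replaceFirst(replacement));
                    // Writes the platform line separator; a file without a
                    // trailing newline gains one.
                    out.newLine();
                }
            }
            // Swap the rewritten copy in for the original.
            Files.move(tmp, file, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            // Cleans up the temp file if anything failed before the move.
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("test", ".txt");
        Files.write(file, "Unix4j is not such a great tool as I thought.\n"
                .getBytes(StandardCharsets.UTF_8));
        replaceInPlace(file, "is not", "is");
        // prints: Unix4j is such a great tool as I thought.
        System.out.print(new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
        Files.delete(file);
    }
}
```

Note that the rewritten file gets the permissions of the newly created temp file, not those of the original, so a real implementation would probably want to copy the file attributes across before the move.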

As a final note, please feel free to use unix4j as is if you like it (as many others do btw). If you do not like it then you have basically 3 options: (1) provide a pull request with fixes/enhancements/improvements (we are always happy to review and consider pull requests), (2) write your own tool (e.g. port a unix c source to java if you like), or (3) simply don't use it and look for another tool or implementation.

Thanks for your honest feedback.
Regards, Marco